2023-04-18 01:41:33

by Daniel Rosenberg

Subject: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

These patches extend FUSE to be able to act as a stacked filesystem. This
allows pure passthrough, where the fuse file system simply reflects the lower
filesystem, and also allows optional pre and post filtering in BPF and/or the
userspace daemon as needed. This can dramatically reduce or even eliminate
transitions to and from userspace.

In this patch set, I've reworked the bpf code to add a new struct_op type
instead of a new program type, and used new kfuncs in place of new helpers.
Additionally, it now uses dynptrs for variable sized buffers. The first three
patches are repeats of a previous patch set which I have not yet adjusted for
comments. I plan to adjust those and submit them separately with fixes, but
wanted to have the current fuse-bpf code visible before then.

Patches 4-7 mostly rearrange existing code to remove noise from the main patch.
Patch 8 contains the main sections of fuse-bpf.
Patches 9-25 implement most FUSE functions as operations on a lower
filesystem. From patch 25, you can run fuse as a passthrough filesystem.
Patches 26-32 provide bpf functionality so that you can alter fuse parameters
via fuse_op programs.
Patch 33 extends this to userspace, and patches 34-37 add some testing
functionality.

There's definitely a lot of cleanup and some restructuring I would like to do.
In the current form, I could get rid of the large macro in favor of a function
that takes a struct that groups a bunch of function pointers, although I'm not
sure a function that takes three void*'s is much better than the macro... I'm
definitely open to suggestions on how to clean that up.

This changes the format of adding a backing file/bpf slightly from v2. fuse_op
programs are specified by name, limited to 15 characters. The block added to
fuse_bpf_entries has been increased to compensate. This adds one more unused
field when specifying the backing file.

Lookup responses that add a backing file must go through an ioctl interface.
This is to prevent any attempts at fooling privileged programs with fd
trickery.

Currently, there are two types of fuse_bpf_entry. One for passing the fuse_op
program you wish to use, specified by name, and one for passing the fd of the
backing file you'd like to associate with the given lookup. In the future, this
may be extended to a more complicated system allowing for multiple bpf programs
or backing files. This would come with kfuncs for bpf to indicate which backing
file should be acted upon. Multiple bpf programs would allow chaining existing
programs to extend functionality without requiring an entirely new program set.

You can run this without needing to set up a userspace daemon by adding these
mount options: root_dir=[fd],no_daemon where fd is an open file descriptor
pointing to the folder you'd like to use as the root directory. The fd can be
immediately closed after mounting. You may also set a root_bpf program by
setting root_bpf=[fuse_op name] after registering a fuse_op program.
This is useful for running various fs tests.
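
The daemon-less mount described above could be driven from C roughly as follows. This is only a sketch: the `root_dir=`, `no_daemon` and `root_bpf=` option names and the 15-character name limit come from this cover letter, while the helper name `fuse_bpf_build_opts` is hypothetical and any further options a real mount may need (for example `rootmode=`, `user_id=`, `group_id=`) are not shown because they have not been checked against the patches.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Name limit quoted in the cover letter (BPF_FUSE_NAME_MAX in patch 05). */
#define DEMO_FUSE_OP_NAME_MAX 15

/*
 * Hypothetical helper: format "root_dir=<fd>,no_daemon", optionally
 * followed by ",root_bpf=<name>". Returns 0 on success, -1 if the name
 * is too long or the output buffer is too small.
 */
static int fuse_bpf_build_opts(char *buf, size_t len, int root_fd,
			       const char *root_bpf)
{
	int n;

	assert(buf && len > 0);

	if (root_bpf && strlen(root_bpf) > DEMO_FUSE_OP_NAME_MAX)
		return -1;

	if (root_bpf)
		n = snprintf(buf, len, "root_dir=%d,no_daemon,root_bpf=%s",
			     root_fd, root_bpf);
	else
		n = snprintf(buf, len, "root_dir=%d,no_daemon", root_fd);

	/* snprintf reports the length it wanted; detect truncation. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string would be handed to mount(2) as its data argument, presumably with the "fuse" filesystem type, after which the directory fd can be closed as noted above.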

This patch set is against bpf-next.

The main changes for v3:
Restructured around struct_op programs
Using dynptrs instead of packets
Using kfuncs instead of new helpers
Selftests now use skel for loading

Alessio Balsini (1):
fs: Generic function to convert iocb to rw flags

Daniel Rosenberg (36):
bpf: verifier: Accept dynptr mem as mem in helpers
bpf: Allow NULL buffers in bpf_dynptr_slice(_rw)
selftests/bpf: Test allowing NULL buffer in dynptr slice
fuse-bpf: Update fuse side uapi
fuse-bpf: Add data structures for fuse-bpf
fuse-bpf: Prepare for fuse-bpf patch
fuse: Add fuse-bpf, a stacked fs extension for FUSE
fuse-bpf: Add ioctl interface for /dev/fuse
fuse-bpf: Don't support export_operations
fuse-bpf: Add support for access
fuse-bpf: Partially add mapping support
fuse-bpf: Add lseek support
fuse-bpf: Add support for fallocate
fuse-bpf: Support file/dir open/close
fuse-bpf: Support mknod/unlink/mkdir/rmdir
fuse-bpf: Add support for read/write iter
fuse-bpf: support readdir
fuse-bpf: Add support for sync operations
fuse-bpf: Add Rename support
fuse-bpf: Add attr support
fuse-bpf: Add support for FUSE_COPY_FILE_RANGE
fuse-bpf: Add xattr support
fuse-bpf: Add symlink/link support
fuse-bpf: allow mounting with no userspace daemon
bpf: Increase struct_op limits
fuse-bpf: Add fuse-bpf constants
WIP: bpf: Add fuse_ops struct_op programs
fuse-bpf: Export Functions
fuse: Provide registration functions for fuse-bpf
fuse-bpf: Set fuse_ops at mount or lookup time
fuse-bpf: Call bpf for pre/post filters
fuse-bpf: Add userspace pre/post filters
WIP: fuse-bpf: add error_out
tools: Add FUSE, update bpf includes
fuse-bpf: Add selftests
fuse: Provide easy way to test fuse struct_op call

Documentation/bpf/kfuncs.rst | 23 +-
fs/fuse/Kconfig | 8 +
fs/fuse/Makefile | 1 +
fs/fuse/backing.c | 4241 +++++++++++++++++
fs/fuse/bpf_register.c | 209 +
fs/fuse/control.c | 2 +-
fs/fuse/dev.c | 85 +-
fs/fuse/dir.c | 344 +-
fs/fuse/file.c | 63 +-
fs/fuse/fuse_i.h | 495 +-
fs/fuse/inode.c | 360 +-
fs/fuse/ioctl.c | 2 +-
fs/fuse/readdir.c | 5 +
fs/fuse/xattr.c | 18 +
fs/overlayfs/file.c | 23 +-
include/linux/bpf.h | 2 +-
include/linux/bpf_fuse.h | 283 ++
include/linux/fs.h | 5 +
include/uapi/linux/bpf.h | 12 +
include/uapi/linux/fuse.h | 41 +
kernel/bpf/Makefile | 4 +
kernel/bpf/bpf_fuse.c | 241 +
kernel/bpf/bpf_struct_ops.c | 6 +-
kernel/bpf/bpf_struct_ops_types.h | 4 +
kernel/bpf/btf.c | 1 +
kernel/bpf/helpers.c | 32 +-
kernel/bpf/verifier.c | 32 +
tools/include/uapi/linux/bpf.h | 12 +
tools/include/uapi/linux/fuse.h | 1135 +++++
.../testing/selftests/bpf/prog_tests/dynptr.c | 1 +
.../selftests/bpf/progs/dynptr_success.c | 21 +
.../selftests/filesystems/fuse/.gitignore | 2 +
.../selftests/filesystems/fuse/Makefile | 189 +
.../testing/selftests/filesystems/fuse/OWNERS | 2 +
.../selftests/filesystems/fuse/bpf_common.h | 51 +
.../selftests/filesystems/fuse/bpf_loader.c | 597 +++
.../testing/selftests/filesystems/fuse/fd.txt | 21 +
.../selftests/filesystems/fuse/fd_bpf.bpf.c | 397 ++
.../selftests/filesystems/fuse/fuse_daemon.c | 300 ++
.../selftests/filesystems/fuse/fuse_test.c | 2412 ++++++++++
.../filesystems/fuse/struct_op_test.bpf.c | 642 +++
.../selftests/filesystems/fuse/test.bpf.c | 996 ++++
.../filesystems/fuse/test_framework.h | 172 +
.../selftests/filesystems/fuse/test_fuse.h | 494 ++
44 files changed, 13755 insertions(+), 231 deletions(-)
create mode 100644 fs/fuse/backing.c
create mode 100644 fs/fuse/bpf_register.c
create mode 100644 include/linux/bpf_fuse.h
create mode 100644 kernel/bpf/bpf_fuse.c
create mode 100644 tools/include/uapi/linux/fuse.h
create mode 100644 tools/testing/selftests/filesystems/fuse/.gitignore
create mode 100644 tools/testing/selftests/filesystems/fuse/Makefile
create mode 100644 tools/testing/selftests/filesystems/fuse/OWNERS
create mode 100644 tools/testing/selftests/filesystems/fuse/bpf_common.h
create mode 100644 tools/testing/selftests/filesystems/fuse/bpf_loader.c
create mode 100644 tools/testing/selftests/filesystems/fuse/fd.txt
create mode 100644 tools/testing/selftests/filesystems/fuse/fd_bpf.bpf.c
create mode 100644 tools/testing/selftests/filesystems/fuse/fuse_daemon.c
create mode 100644 tools/testing/selftests/filesystems/fuse/fuse_test.c
create mode 100644 tools/testing/selftests/filesystems/fuse/struct_op_test.bpf.c
create mode 100644 tools/testing/selftests/filesystems/fuse/test.bpf.c
create mode 100644 tools/testing/selftests/filesystems/fuse/test_framework.h
create mode 100644 tools/testing/selftests/filesystems/fuse/test_fuse.h


base-commit: 49859de997c3115b85544bce6b6ceab60a7fabc4
--
2.40.0.634.g4ca3ef3211-goog


2023-04-18 01:41:50

by Daniel Rosenberg

Subject: [RFC PATCH v3 01/37] bpf: verifier: Accept dynptr mem as mem in helpers

This allows using memory retrieved from dynptrs with helper functions
that accept ARG_PTR_TO_MEM. For instance, results from bpf_dynptr_data
can be passed along to bpf_strncmp.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
kernel/bpf/verifier.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1e05355facdc..ebc638bfed87 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7128,12 +7128,16 @@ static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
* ARG_PTR_TO_MEM + MAYBE_NULL is compatible with PTR_TO_MEM and PTR_TO_MEM + MAYBE_NULL,
* but ARG_PTR_TO_MEM is compatible only with PTR_TO_MEM but NOT with PTR_TO_MEM + MAYBE_NULL
*
+ * ARG_PTR_TO_MEM is compatible with PTR_TO_MEM that is tagged with a dynptr type.
+ *
* Therefore we fold these flags depending on the arg_type before comparison.
*/
if (arg_type & MEM_RDONLY)
type &= ~MEM_RDONLY;
if (arg_type & PTR_MAYBE_NULL)
type &= ~PTR_MAYBE_NULL;
+ if (base_type(arg_type) == ARG_PTR_TO_MEM)
+ type &= ~DYNPTR_TYPE_FLAG_MASK;

if (meta->func_id == BPF_FUNC_kptr_xchg && type & MEM_ALLOC)
type &= ~MEM_ALLOC;
--
2.40.0.634.g4ca3ef3211-goog
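
The flag folding in the hunk above can be modelled in plain userspace C. The sketch below is not verifier code: all `MODEL_*` names and bit values are made up for illustration, and the real check goes through a compatibility table rather than a single comparison. It only shows why clearing the dynptr type bits lets dynptr-derived memory match an argument that expects plain memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit layout only; the kernel's real values differ. */
#define MODEL_PTR_TO_MEM	0x01u		/* base register type */
#define MODEL_MEM_RDONLY	(1u << 8)
#define MODEL_PTR_MAYBE_NULL	(1u << 9)
#define MODEL_DYNPTR_LOCAL	(1u << 10)	/* "came from a dynptr" tag */
#define MODEL_DYNPTR_TYPE_MASK	MODEL_DYNPTR_LOCAL

/*
 * Model of the check for an argument that expects memory: fold away the
 * flags the argument tolerates, then compare against the base type.
 * fold_dynptr selects behaviour with (true) or without (false) the patch.
 */
static bool model_mem_arg_accepts(uint32_t arg_flags, uint32_t reg_type,
				  bool fold_dynptr)
{
	if (arg_flags & MODEL_MEM_RDONLY)
		reg_type &= ~MODEL_MEM_RDONLY;
	if (arg_flags & MODEL_PTR_MAYBE_NULL)
		reg_type &= ~MODEL_PTR_MAYBE_NULL;
	if (fold_dynptr)
		reg_type &= ~MODEL_DYNPTR_TYPE_MASK;

	return reg_type == MODEL_PTR_TO_MEM;
}
```

In this model, a register typed `MODEL_PTR_TO_MEM | MODEL_DYNPTR_LOCAL` is rejected without the fold and accepted with it, while a possibly-NULL pointer is still rejected unless the argument itself allows NULL.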

2023-04-18 01:41:54

by Daniel Rosenberg

Subject: [RFC PATCH v3 03/37] selftests/bpf: Test allowing NULL buffer in dynptr slice

bpf_dynptr_slice(_rw) no longer requires a buffer for verification. If the
buffer is needed, but not present, the function will return NULL.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
.../testing/selftests/bpf/prog_tests/dynptr.c | 1 +
.../selftests/bpf/progs/dynptr_success.c | 21 +++++++++++++++++++
2 files changed, 22 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr.c b/tools/testing/selftests/bpf/prog_tests/dynptr.c
index d176c34a7d2e..db22cad32657 100644
--- a/tools/testing/selftests/bpf/prog_tests/dynptr.c
+++ b/tools/testing/selftests/bpf/prog_tests/dynptr.c
@@ -20,6 +20,7 @@ static struct {
{"test_ringbuf", SETUP_SYSCALL_SLEEP},
{"test_skb_readonly", SETUP_SKB_PROG},
{"test_dynptr_skb_data", SETUP_SKB_PROG},
+ {"test_dynptr_skb_no_buff", SETUP_SKB_PROG},
};

static void verify_success(const char *prog_name, enum test_setup_type setup_type)
diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
index b2fa6c47ecc0..a059ed8d4590 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_success.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
@@ -207,3 +207,24 @@ int test_dynptr_skb_data(struct __sk_buff *skb)

return 1;
}
+
+SEC("?cgroup_skb/egress")
+int test_dynptr_skb_no_buff(struct __sk_buff *skb)
+{
+ struct bpf_dynptr ptr;
+ __u64 *data;
+
+ if (bpf_dynptr_from_skb(skb, 0, &ptr)) {
+ err = 1;
+ return 1;
+ }
+
+ /* This should return NULL. SKB may require a buffer */
+ data = bpf_dynptr_slice(&ptr, 0, NULL, 1);
+ if (data) {
+ err = 2;
+ return 1;
+ }
+
+ return 1;
+}
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:42:04

by Daniel Rosenberg

Subject: [RFC PATCH v3 05/37] fuse-bpf: Update fuse side uapi

Adds structures which will be used to inform fuse about what it is being
stacked on top of. Once filters are in place, error_in will inform the
post filter if the backing call returned an error.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
include/uapi/linux/fuse.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 1b9d0dfae72d..04d96f34e9a1 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -607,6 +607,29 @@ struct fuse_entry_out {
struct fuse_attr attr;
};

+#define FUSE_BPF_MAX_ENTRIES 2
+
+enum fuse_bpf_type {
+ FUSE_ENTRY_BACKING = 1,
+ FUSE_ENTRY_BPF = 2,
+ FUSE_ENTRY_REMOVE_BACKING = 3,
+ FUSE_ENTRY_REMOVE_BPF = 4,
+};
+
+#define BPF_FUSE_NAME_MAX 15
+
+struct fuse_bpf_entry_out {
+ uint32_t entry_type;
+ uint32_t unused;
+ union {
+ struct {
+ uint64_t unused2;
+ uint64_t fd;
+ };
+ char name[BPF_FUSE_NAME_MAX + 1];
+ };
+};
+
struct fuse_forget_in {
uint64_t nlookup;
};
--
2.40.0.634.g4ca3ef3211-goog
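
As a rough illustration of how a daemon might fill in the two kinds of entry, the sketch below copies the definitions from the hunk above into a standalone file (the patched `linux/fuse.h` is assumed not to be installed) and adds two helpers. The helper names `entry_set_backing` and `entry_set_bpf` are hypothetical, and how the filled entries are actually handed to the kernel is not shown here.

```c
#include <stdint.h>
#include <string.h>

/* Local copies of the definitions added by this patch. */
#define BPF_FUSE_NAME_MAX 15

enum fuse_bpf_type {
	FUSE_ENTRY_BACKING		= 1,
	FUSE_ENTRY_BPF			= 2,
	FUSE_ENTRY_REMOVE_BACKING	= 3,
	FUSE_ENTRY_REMOVE_BPF		= 4,
};

struct fuse_bpf_entry_out {
	uint32_t	entry_type;
	uint32_t	unused;
	union {
		struct {
			uint64_t unused2;
			uint64_t fd;
		};
		char name[BPF_FUSE_NAME_MAX + 1];
	};
};

/* Hypothetical helper: describe a backing file by its fd. */
static void entry_set_backing(struct fuse_bpf_entry_out *e, int fd)
{
	memset(e, 0, sizeof(*e));
	e->entry_type = FUSE_ENTRY_BACKING;
	e->fd = (uint64_t)fd;
}

/*
 * Hypothetical helper: select a fuse_op program by name.
 * Returns -1 if the name exceeds BPF_FUSE_NAME_MAX characters.
 */
static int entry_set_bpf(struct fuse_bpf_entry_out *e, const char *name)
{
	size_t len = strlen(name);

	if (len > BPF_FUSE_NAME_MAX)
		return -1;

	memset(e, 0, sizeof(*e));
	e->entry_type = FUSE_ENTRY_BPF;
	memcpy(e->name, name, len);	/* zeroed tail keeps it NUL-terminated */
	return 0;
}
```

Because `fd` and `name` share the union, each entry carries one or the other, which matches the two entry types described in the cover letter.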

2023-04-18 01:42:15

by Daniel Rosenberg

Subject: [RFC PATCH v3 04/37] fs: Generic function to convert iocb to rw flags

From: Alessio Balsini <[email protected]>

OverlayFS implements its own function to translate iocb flags into rw
flags, so that they can be passed into another vfs call.
With commit ce71bfea207b4 ("fs: align IOCB_* flags with RWF_* flags")
Jens created a 1:1 matching between the iocb flags and rw flags,
simplifying the conversion.

Reduce the OverlayFS code by making the flag conversion function generic
and reusable.

Signed-off-by: Alessio Balsini <[email protected]>
---
fs/overlayfs/file.c | 23 +++++------------------
include/linux/fs.h | 5 +++++
2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 7c04f033aadd..759893e4da04 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -15,6 +15,8 @@
#include <linux/fs.h>
#include "overlayfs.h"

+#define OVL_IOCB_MASK (IOCB_DSYNC | IOCB_HIPRI | IOCB_NOWAIT | IOCB_SYNC)
+
struct ovl_aio_req {
struct kiocb iocb;
refcount_t ref;
@@ -241,22 +243,6 @@ static void ovl_file_accessed(struct file *file)
touch_atime(&file->f_path);
}

-static rwf_t ovl_iocb_to_rwf(int ifl)
-{
- rwf_t flags = 0;
-
- if (ifl & IOCB_NOWAIT)
- flags |= RWF_NOWAIT;
- if (ifl & IOCB_HIPRI)
- flags |= RWF_HIPRI;
- if (ifl & IOCB_DSYNC)
- flags |= RWF_DSYNC;
- if (ifl & IOCB_SYNC)
- flags |= RWF_SYNC;
-
- return flags;
-}
-
static inline void ovl_aio_put(struct ovl_aio_req *aio_req)
{
if (refcount_dec_and_test(&aio_req->ref)) {
@@ -316,7 +302,8 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
old_cred = ovl_override_creds(file_inode(file)->i_sb);
if (is_sync_kiocb(iocb)) {
ret = vfs_iter_read(real.file, iter, &iocb->ki_pos,
- ovl_iocb_to_rwf(iocb->ki_flags));
+ iocb_to_rw_flags(iocb->ki_flags,
+ OVL_IOCB_MASK));
} else {
struct ovl_aio_req *aio_req;

@@ -380,7 +367,7 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
if (is_sync_kiocb(iocb)) {
file_start_write(real.file);
ret = vfs_iter_write(real.file, iter, &iocb->ki_pos,
- ovl_iocb_to_rwf(ifl));
+ iocb_to_rw_flags(ifl, OVL_IOCB_MASK));
file_end_write(real.file);
/* Update size */
ovl_copyattr(inode);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c85916e9f7db..c849074f44b7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3009,6 +3009,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
return 0;
}

+static inline rwf_t iocb_to_rw_flags(int ifl, int iocb_mask)
+{
+ return ifl & iocb_mask;
+}
+
static inline ino_t parent_ino(struct dentry *dentry)
{
ino_t res;
--
2.40.0.634.g4ca3ef3211-goog
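
The equivalence this patch relies on can be checked in userspace. The sketch below re-declares the four flags under `DEMO_` names, using the bit values of `RWF_*` from the uapi `linux/fs.h` and assuming, per the commit cited above, that the `IOCB_*` flags share those bits. It then compares the removed per-flag translation with the new mask-based one.

```c
/* Demo copies of the flag bits; IOCB_* is assumed to alias RWF_* 1:1. */
#define DEMO_RWF_HIPRI		0x00000001
#define DEMO_RWF_DSYNC		0x00000002
#define DEMO_RWF_SYNC		0x00000004
#define DEMO_RWF_NOWAIT		0x00000008

#define DEMO_IOCB_HIPRI		DEMO_RWF_HIPRI
#define DEMO_IOCB_DSYNC		DEMO_RWF_DSYNC
#define DEMO_IOCB_SYNC		DEMO_RWF_SYNC
#define DEMO_IOCB_NOWAIT	DEMO_RWF_NOWAIT

#define DEMO_OVL_IOCB_MASK \
	(DEMO_IOCB_DSYNC | DEMO_IOCB_HIPRI | DEMO_IOCB_NOWAIT | DEMO_IOCB_SYNC)

typedef int demo_rwf_t;

/* The per-flag translation that the patch removes from overlayfs. */
static demo_rwf_t demo_ovl_iocb_to_rwf(int ifl)
{
	demo_rwf_t flags = 0;

	if (ifl & DEMO_IOCB_NOWAIT)
		flags |= DEMO_RWF_NOWAIT;
	if (ifl & DEMO_IOCB_HIPRI)
		flags |= DEMO_RWF_HIPRI;
	if (ifl & DEMO_IOCB_DSYNC)
		flags |= DEMO_RWF_DSYNC;
	if (ifl & DEMO_IOCB_SYNC)
		flags |= DEMO_RWF_SYNC;

	return flags;
}

/* The generic replacement: keep only the bits the caller allows. */
static demo_rwf_t demo_iocb_to_rw_flags(int ifl, int iocb_mask)
{
	return ifl & iocb_mask;
}
```

With identical bit assignments, both functions return the same value for every input, including inputs that carry other iocb bits, since the mask drops them just as the old code ignored them.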

2023-04-18 01:42:23

by Daniel Rosenberg

Subject: [RFC PATCH v3 06/37] fuse-bpf: Add data structures for fuse-bpf

These structures will be used to translate between the fuse bpf calls and
normal userspace calls.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
include/linux/bpf_fuse.h | 84 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 84 insertions(+)
create mode 100644 include/linux/bpf_fuse.h

diff --git a/include/linux/bpf_fuse.h b/include/linux/bpf_fuse.h
new file mode 100644
index 000000000000..ce8b1b347496
--- /dev/null
+++ b/include/linux/bpf_fuse.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright 2022 Google LLC.
+ */
+
+#ifndef _BPF_FUSE_H
+#define _BPF_FUSE_H
+
+#include <linux/types.h>
+#include <linux/fuse.h>
+
+struct fuse_buffer {
+ void *data;
+ unsigned size;
+ unsigned alloc_size;
+ unsigned max_size;
+ int flags;
+};
+
+/* These flags are used internally to track information about the fuse buffers.
+ * Fuse sets some of the flags in init. The helper functions set others, depending on what
+ * was requested by the bpf program.
+ */
+// Flags set by FUSE
+#define BPF_FUSE_IMMUTABLE (1 << 0) // Buffer may not be written to
+#define BPF_FUSE_VARIABLE_SIZE (1 << 1) // Buffer length may be changed (growth requires alloc)
+#define BPF_FUSE_MUST_ALLOCATE (1 << 2) // Buffer must be re allocated before allowing writes
+
+// Flags set by helper function
+#define BPF_FUSE_MODIFIED (1 << 3) // The helper function allowed writes to the buffer
+#define BPF_FUSE_ALLOCATED (1 << 4) // The helper function allocated the buffer
+
+/*
+ * BPF Fuse Args
+ *
+ * Used to translate between bpf program parameters and their userspace equivalent calls.
+ * Variable sized arguments are held in fuse_buffers. To access these, bpf programs must
+ * use kfuncs to access them as dynptrs.
+ *
+ */
+
+#define FUSE_MAX_ARGS_IN 3
+#define FUSE_MAX_ARGS_OUT 2
+
+struct bpf_fuse_arg {
+ union {
+ void *value;
+ struct fuse_buffer *buffer;
+ };
+ unsigned size;
+ bool is_buffer;
+};
+
+struct bpf_fuse_meta_info {
+ uint64_t nodeid;
+ uint32_t opcode;
+ uint32_t error_in;
+};
+
+struct bpf_fuse_args {
+ struct bpf_fuse_meta_info info;
+ uint32_t in_numargs;
+ uint32_t out_numargs;
+ uint32_t flags;
+ struct bpf_fuse_arg in_args[FUSE_MAX_ARGS_IN];
+ struct bpf_fuse_arg out_args[FUSE_MAX_ARGS_OUT];
+};
+
+// Mirrors for struct fuse_args flags
+#define FUSE_BPF_FORCE (1 << 0)
+#define FUSE_BPF_OUT_ARGVAR (1 << 6)
+#define FUSE_BPF_IS_LOOKUP (1 << 11)
+
+static inline void *bpf_fuse_arg_value(const struct bpf_fuse_arg *arg)
+{
+ return arg->is_buffer ? arg->buffer : arg->value;
+}
+
+static inline unsigned bpf_fuse_arg_size(const struct bpf_fuse_arg *arg)
+{
+ return arg->is_buffer ? arg->buffer->size : arg->size;
+}
+
+#endif /* _BPF_FUSE_H */
--
2.40.0.634.g4ca3ef3211-goog
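
To make the split between fixed-size and variable-size arguments concrete, the sketch below copies `struct fuse_buffer`, `struct bpf_fuse_arg` and `bpf_fuse_arg_size()` from the header above into a standalone userspace file. The two constructor helpers, `demo_fixed_arg` and `demo_buffer_arg`, are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the structures introduced by this patch. */
struct fuse_buffer {
	void *data;
	unsigned size;
	unsigned alloc_size;
	unsigned max_size;
	int flags;
};

struct bpf_fuse_arg {
	union {
		void *value;
		struct fuse_buffer *buffer;
	};
	unsigned size;
	bool is_buffer;
};

static inline unsigned bpf_fuse_arg_size(const struct bpf_fuse_arg *arg)
{
	return arg->is_buffer ? arg->buffer->size : arg->size;
}

/* Hypothetical helper: wrap a fixed-size struct such as an opcode header. */
static struct bpf_fuse_arg demo_fixed_arg(void *value, unsigned size)
{
	struct bpf_fuse_arg arg = {
		.value = value,
		.size = size,
		.is_buffer = false,
	};

	return arg;
}

/* Hypothetical helper: wrap a variable-size buffer such as a file name. */
static struct bpf_fuse_arg demo_buffer_arg(struct fuse_buffer *buf)
{
	struct bpf_fuse_arg arg = {
		.buffer = buf,
		.is_buffer = true,
	};

	return arg;
}
```

For a buffer argument the reported size tracks `fuse_buffer.size`, so a change to the buffer length is visible through `bpf_fuse_arg_size()` without touching the `bpf_fuse_arg` itself.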

2023-04-18 01:42:31

by Daniel Rosenberg

Subject: [RFC PATCH v3 02/37] bpf: Allow NULL buffers in bpf_dynptr_slice(_rw)

bpf_dynptr_slice(_rw) uses a user provided buffer if it cannot provide
a pointer to a block of contiguous memory. This buffer is unused in the
case of local dynptrs, and may be unused in other cases as well. There
is no need to require the buffer, as the kfunc can just return NULL if
it was needed and not provided.

This adds another kfunc annotation, __opt, which combines with __sz and
__szk to allow the buffer associated with the size to be NULL. If the
buffer is NULL, the verifier does not check that the buffer is of
sufficient size.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
Documentation/bpf/kfuncs.rst | 23 ++++++++++++++++++++++-
kernel/bpf/helpers.c | 32 ++++++++++++++++++++------------
kernel/bpf/verifier.c | 19 +++++++++++++++++++
3 files changed, 61 insertions(+), 13 deletions(-)

diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index ea2516374d92..7a3d9de5f315 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -100,7 +100,7 @@ Hence, whenever a constant scalar argument is accepted by a kfunc which is not a
size parameter, and the value of the constant matters for program safety, __k
suffix should be used.

-2.2.2 __uninit Annotation
+2.2.3 __uninit Annotation
-------------------------

This annotation is used to indicate that the argument will be treated as
@@ -117,6 +117,27 @@ Here, the dynptr will be treated as an uninitialized dynptr. Without this
annotation, the verifier will reject the program if the dynptr passed in is
not initialized.

+2.2.4 __opt Annotation
+-------------------------
+
+This annotation is used to indicate that the buffer associated with an __sz or __szk
+argument may be null. If the function is passed a nullptr in place of the buffer,
+the verifier will not check that length is appropriate for the buffer. The kfunc is
+responsible for checking if this buffer is null before using it.
+
+An example is given below::
+
+ __bpf_kfunc void *bpf_dynptr_slice(..., void *buffer__opt, u32 buffer__szk)
+ {
+ ...
+ }
+
+Here, the buffer may be null. If buffer is not null, it is at least of size buffer__szk.
+Either way, the returned buffer is either NULL, or of size buffer__szk. Without this
+annotation, the verifier will reject the program if a null pointer is passed in with
+a nonzero size.
+
+
.. _BPF_kfunc_nodef:

2.3 Using an existing kernel function
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 00e5fb0682ac..bfb75ecacb76 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -2167,13 +2167,15 @@ __bpf_kfunc struct task_struct *bpf_task_from_pid(s32 pid)
* bpf_dynptr_slice() - Obtain a read-only pointer to the dynptr data.
* @ptr: The dynptr whose data slice to retrieve
* @offset: Offset into the dynptr
- * @buffer: User-provided buffer to copy contents into
- * @buffer__szk: Size (in bytes) of the buffer. This is the length of the
- * requested slice. This must be a constant.
+ * @buffer__opt: User-provided buffer to copy contents into. May be NULL
+ * @buffer__szk: Size (in bytes) of the buffer if present. This is the
+ * length of the requested slice. This must be a constant.
*
* For non-skb and non-xdp type dynptrs, there is no difference between
* bpf_dynptr_slice and bpf_dynptr_data.
*
+ * If buffer__opt is NULL, the call will fail if buffer__opt was needed.
+ *
* If the intention is to write to the data slice, please use
* bpf_dynptr_slice_rdwr.
*
@@ -2190,7 +2192,7 @@ __bpf_kfunc struct task_struct *bpf_task_from_pid(s32 pid)
* direct pointer)
*/
__bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset,
- void *buffer, u32 buffer__szk)
+ void *buffer__opt, u32 buffer__szk)
{
enum bpf_dynptr_type type;
u32 len = buffer__szk;
@@ -2210,15 +2212,19 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
case BPF_DYNPTR_TYPE_RINGBUF:
return ptr->data + ptr->offset + offset;
case BPF_DYNPTR_TYPE_SKB:
- return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
+ if (!buffer__opt)
+ return NULL;
+ return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer__opt);
case BPF_DYNPTR_TYPE_XDP:
{
void *xdp_ptr = bpf_xdp_pointer(ptr->data, ptr->offset + offset, len);
if (xdp_ptr)
return xdp_ptr;

- bpf_xdp_copy_buf(ptr->data, ptr->offset + offset, buffer, len, false);
- return buffer;
+ if (!buffer__opt)
+ return NULL;
+ bpf_xdp_copy_buf(ptr->data, ptr->offset + offset, buffer__opt, len, false);
+ return buffer__opt;
}
default:
WARN_ONCE(true, "unknown dynptr type %d\n", type);
@@ -2230,13 +2236,15 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
* bpf_dynptr_slice_rdwr() - Obtain a writable pointer to the dynptr data.
* @ptr: The dynptr whose data slice to retrieve
* @offset: Offset into the dynptr
- * @buffer: User-provided buffer to copy contents into
- * @buffer__szk: Size (in bytes) of the buffer. This is the length of the
- * requested slice. This must be a constant.
+ * @buffer__opt: User-provided buffer to copy contents into. May be NULL
+ * @buffer__szk: Size (in bytes) of the buffer if present. This is the
+ * length of the requested slice. This must be a constant.
*
* For non-skb and non-xdp type dynptrs, there is no difference between
* bpf_dynptr_slice and bpf_dynptr_data.
*
+ * If buffer__opt is NULL, the call will fail if buffer__opt was needed.
+ *
* The returned pointer is writable and may point to either directly the dynptr
* data at the requested offset or to the buffer if unable to obtain a direct
* data pointer to (example: the requested slice is to the paged area of an skb
@@ -2267,7 +2275,7 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
* direct pointer)
*/
__bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 offset,
- void *buffer, u32 buffer__szk)
+ void *buffer__opt, u32 buffer__szk)
{
if (!ptr->data || bpf_dynptr_is_rdonly(ptr))
return NULL;
@@ -2294,7 +2302,7 @@ __bpf_kfunc void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr_kern *ptr, u32 o
* will be copied out into the buffer and the user will need to call
* bpf_dynptr_write() to commit changes.
*/
- return bpf_dynptr_slice(ptr, offset, buffer, buffer__szk);
+ return bpf_dynptr_slice(ptr, offset, buffer__opt, buffer__szk);
}

__bpf_kfunc void *bpf_cast_to_kern_ctx(void *obj)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ebc638bfed87..fd959824469d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9387,6 +9387,19 @@ static bool is_kfunc_arg_const_mem_size(const struct btf *btf,
return __kfunc_param_match_suffix(btf, arg, "__szk");
}

+static bool is_kfunc_arg_optional(const struct btf *btf,
+ const struct btf_param *arg,
+ const struct bpf_reg_state *reg)
+{
+ const struct btf_type *t;
+
+ t = btf_type_skip_modifiers(btf, arg->type, NULL);
+ if (!btf_type_is_ptr(t) || reg->type != SCALAR_VALUE || reg->umax_value > 0)
+ return false;
+
+ return __kfunc_param_match_suffix(btf, arg, "__opt");
+}
+
static bool is_kfunc_arg_constant(const struct btf *btf, const struct btf_param *arg)
{
return __kfunc_param_match_suffix(btf, arg, "__k");
@@ -10453,10 +10466,16 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
break;
case KF_ARG_PTR_TO_MEM_SIZE:
{
+ struct bpf_reg_state *buff_reg = &regs[regno];
+ const struct btf_param *buff_arg = &args[i];
struct bpf_reg_state *size_reg = &regs[regno + 1];
const struct btf_param *size_arg = &args[i + 1];

ret = check_kfunc_mem_size_reg(env, size_reg, regno + 1);
+ if (ret < 0 && is_kfunc_arg_optional(meta->btf, buff_arg, buff_reg)) {
+ verbose(env, "error was %d", ret);
+ ret = 0;
+ }
if (ret < 0) {
verbose(env, "arg#%d arg#%d memory, len pair leads to invalid memory access\n", i, i + 1);
return ret;
--
2.40.0.634.g4ca3ef3211-goog
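
The optional-buffer contract can be illustrated with a small userspace model. Nothing below is kernel code: `struct demo_frag_buf` and `demo_slice` are made-up stand-ins for a dynptr whose data lives in two fragments, loosely following the XDP path in the hunk above (return a direct pointer when the range is contiguous, otherwise copy into the caller's buffer, or return NULL if no buffer was supplied). Note that the SKB path in the patch is stricter and returns NULL whenever the buffer is absent.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for fragmented data, e.g. linear plus paged packet data. */
struct demo_frag_buf {
	const unsigned char *head;
	size_t head_len;
	const unsigned char *tail;
	size_t tail_len;
};

/*
 * Return a pointer to len bytes starting at off. buffer_opt may be NULL;
 * it is only needed when the range straddles both fragments.
 */
static const void *demo_slice(const struct demo_frag_buf *p, size_t off,
			      void *buffer_opt, size_t len)
{
	size_t total = p->head_len + p->tail_len;
	size_t in_head;

	if (off > total || len > total - off)
		return NULL;			/* out of bounds */

	if (off + len <= p->head_len)
		return p->head + off;		/* contiguous in head */
	if (off >= p->head_len)
		return p->tail + (off - p->head_len); /* contiguous in tail */

	if (!buffer_opt)
		return NULL;			/* buffer needed, not given */

	/* Range straddles the fragments: assemble it in the caller's buffer. */
	in_head = p->head_len - off;
	memcpy(buffer_opt, p->head + off, in_head);
	memcpy((unsigned char *)buffer_opt + in_head, p->tail, len - in_head);
	return buffer_opt;
}
```

A caller that knows its data is contiguous can pass NULL and skip the scratch buffer, and only has to handle a NULL return in the case where that assumption turns out to be wrong.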

2023-04-18 01:42:32

by Daniel Rosenberg

Subject: [RFC PATCH v3 07/37] fuse-bpf: Prepare for fuse-bpf patch

This moves some functions and structs around to make the following patch
easier to read.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/dir.c | 30 ------------------------------
fs/fuse/fuse_i.h | 35 +++++++++++++++++++++++++++++++++++
fs/fuse/inode.c | 44 ++++++++++++++++++++++----------------------
3 files changed, 57 insertions(+), 52 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 35bc174f9ba2..55dd6e8b2e43 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -46,10 +46,6 @@ static inline u64 fuse_dentry_time(const struct dentry *entry)
}

#else
-union fuse_dentry {
- u64 time;
- struct rcu_head rcu;
-};

static inline void __fuse_dentry_settime(struct dentry *dentry, u64 time)
{
@@ -83,27 +79,6 @@ static void fuse_dentry_settime(struct dentry *dentry, u64 time)
__fuse_dentry_settime(dentry, time);
}

-/*
- * FUSE caches dentries and attributes with separate timeout. The
- * time in jiffies until the dentry/attributes are valid is stored in
- * dentry->d_fsdata and fuse_inode->i_time respectively.
- */
-
-/*
- * Calculate the time in jiffies until a dentry/attributes are valid
- */
-static u64 time_to_jiffies(u64 sec, u32 nsec)
-{
- if (sec || nsec) {
- struct timespec64 ts = {
- sec,
- min_t(u32, nsec, NSEC_PER_SEC - 1)
- };
-
- return get_jiffies_64() + timespec64_to_jiffies(&ts);
- } else
- return 0;
-}

/*
* Set dentry and possibly attribute timeouts from the lookup/mk*
@@ -115,11 +90,6 @@ void fuse_change_entry_timeout(struct dentry *entry, struct fuse_entry_out *o)
time_to_jiffies(o->entry_valid, o->entry_valid_nsec));
}

-static u64 attr_timeout(struct fuse_attr_out *o)
-{
- return time_to_jiffies(o->attr_valid, o->attr_valid_nsec);
-}
-
u64 entry_attr_timeout(struct fuse_entry_out *o)
{
return time_to_jiffies(o->attr_valid, o->attr_valid_nsec);
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 9b7fc7d3c7f1..01ca8bb87b4f 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -63,6 +63,14 @@ struct fuse_forget_link {
struct fuse_forget_link *next;
};

+/** FUSE specific dentry data */
+#if BITS_PER_LONG < 64
+union fuse_dentry {
+ u64 time;
+ struct rcu_head rcu;
+};
+#endif
+
/** FUSE inode */
struct fuse_inode {
/** Inode data */
@@ -1324,4 +1332,31 @@ struct fuse_file *fuse_file_open(struct fuse_mount *fm, u64 nodeid,
void fuse_file_release(struct inode *inode, struct fuse_file *ff,
unsigned int open_flags, fl_owner_t id, bool isdir);

+/*
+ * FUSE caches dentries and attributes with separate timeout. The
+ * time in jiffies until the dentry/attributes are valid is stored in
+ * dentry->d_fsdata and fuse_inode->i_time respectively.
+ */
+
+/*
+ * Calculate the time in jiffies until a dentry/attributes are valid
+ */
+static inline u64 time_to_jiffies(u64 sec, u32 nsec)
+{
+ if (sec || nsec) {
+ struct timespec64 ts = {
+ sec,
+ min_t(u32, nsec, NSEC_PER_SEC - 1)
+ };
+
+ return get_jiffies_64() + timespec64_to_jiffies(&ts);
+ } else
+ return 0;
+}
+
+static inline u64 attr_timeout(struct fuse_attr_out *o)
+{
+ return time_to_jiffies(o->attr_valid, o->attr_valid_nsec);
+}
+
#endif /* _FS_FUSE_I_H */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index d66070af145d..a824ca100047 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -162,6 +162,28 @@ static ino_t fuse_squash_ino(u64 ino64)
return ino;
}

+static void fuse_fill_attr_from_inode(struct fuse_attr *attr,
+ const struct fuse_inode *fi)
+{
+ *attr = (struct fuse_attr){
+ .ino = fi->inode.i_ino,
+ .size = fi->inode.i_size,
+ .blocks = fi->inode.i_blocks,
+ .atime = fi->inode.i_atime.tv_sec,
+ .mtime = fi->inode.i_mtime.tv_sec,
+ .ctime = fi->inode.i_ctime.tv_sec,
+ .atimensec = fi->inode.i_atime.tv_nsec,
+ .mtimensec = fi->inode.i_mtime.tv_nsec,
+ .ctimensec = fi->inode.i_ctime.tv_nsec,
+ .mode = fi->inode.i_mode,
+ .nlink = fi->inode.i_nlink,
+ .uid = fi->inode.i_uid.val,
+ .gid = fi->inode.i_gid.val,
+ .rdev = fi->inode.i_rdev,
+ .blksize = 1u << fi->inode.i_blkbits,
+ };
+}
+
void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
u64 attr_valid, u32 cache_mask)
{
@@ -1394,28 +1416,6 @@ void fuse_dev_free(struct fuse_dev *fud)
}
EXPORT_SYMBOL_GPL(fuse_dev_free);

-static void fuse_fill_attr_from_inode(struct fuse_attr *attr,
- const struct fuse_inode *fi)
-{
- *attr = (struct fuse_attr){
- .ino = fi->inode.i_ino,
- .size = fi->inode.i_size,
- .blocks = fi->inode.i_blocks,
- .atime = fi->inode.i_atime.tv_sec,
- .mtime = fi->inode.i_mtime.tv_sec,
- .ctime = fi->inode.i_ctime.tv_sec,
- .atimensec = fi->inode.i_atime.tv_nsec,
- .mtimensec = fi->inode.i_mtime.tv_nsec,
- .ctimensec = fi->inode.i_ctime.tv_nsec,
- .mode = fi->inode.i_mode,
- .nlink = fi->inode.i_nlink,
- .uid = fi->inode.i_uid.val,
- .gid = fi->inode.i_gid.val,
- .rdev = fi->inode.i_rdev,
- .blksize = 1u << fi->inode.i_blkbits,
- };
-}
-
static void fuse_sb_defaults(struct super_block *sb)
{
sb->s_magic = FUSE_SUPER_MAGIC;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:42:46

by Daniel Rosenberg

Subject: [RFC PATCH v3 08/37] fuse: Add fuse-bpf, a stacked fs extension for FUSE

Fuse-bpf provides a short circuit path for Fuse implementations that act
as a stacked filesystem. Operations that need no modification are passed
directly to the backing filesystem. Small adjustments can be handled by
bpf prefilters or postfilters, with the option to fall back to userspace
as needed.

Fuse implementations may supply backing node information, as well as bpf
programs via an optional add on to the lookup structure.

This has been split over the next set of patches for readability.
Clusters of fuse ops have been split into their own patches, as well as
the actual bpf calls and userspace calls for filters.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
Signed-off-by: Alessio Balsini <[email protected]>
---
fs/fuse/Kconfig | 8 +
fs/fuse/Makefile | 1 +
fs/fuse/backing.c | 419 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dev.c | 41 ++++-
fs/fuse/dir.c | 187 +++++++++++++++++----
fs/fuse/file.c | 25 ++-
fs/fuse/fuse_i.h | 99 ++++++++++-
fs/fuse/inode.c | 189 +++++++++++++++++----
fs/fuse/ioctl.c | 2 +-
9 files changed, 888 insertions(+), 83 deletions(-)
create mode 100644 fs/fuse/backing.c
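
Note on the error handling in the bpf_fuse_backing macro in the diff: it uses
_Generic so that one macro can report errors through either an integer or a
pointer result. A small user-space sketch of that idea follows. ERR_PTR() and
PTR_ERR() here are local stand-ins for the kernel helpers of the same name,
and store_error() is a hypothetical name for the final assignment the macro
performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct dentry;	/* opaque here; only pointers to it are used */

/* User-space stand-ins for the kernel helpers of the same name. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/*
 * Store error through out in the form that matches the type of *out:
 * an encoded pointer for pointer results, the plain errno otherwise.
 * *out is left untouched when error is zero.
 */
#define store_error(out, error)					\
	(*(out) = (error) ? _Generic((*(out)),			\
			struct dentry * : ERR_PTR(error),	\
			const char * : ERR_PTR(error),		\
			default : (error))			\
		: (*(out)))
```

With an int result, store_error(&status, -ENOENT) leaves -ENOENT in status.
With a struct dentry * result, the same call stores a pointer from which
PTR_ERR() recovers the errno.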

diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig
index 038ed0b9aaa5..3a64fa73e591 100644
--- a/fs/fuse/Kconfig
+++ b/fs/fuse/Kconfig
@@ -52,3 +52,11 @@ config FUSE_DAX

If you want to allow mounting a Virtio Filesystem with the "dax"
option, answer Y.
+
+config FUSE_BPF
+ bool "Adds BPF to fuse"
+ depends on FUSE_FS
+ depends on BPF
+ help
+ Extends FUSE by adding BPF to prefilter calls and potentially pass them
+ to a backing file system.
diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile
index 0c48b35c058d..a0853c439db2 100644
--- a/fs/fuse/Makefile
+++ b/fs/fuse/Makefile
@@ -9,5 +9,6 @@ obj-$(CONFIG_VIRTIO_FS) += virtiofs.o

fuse-y := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o ioctl.o
fuse-$(CONFIG_FUSE_DAX) += dax.o
+fuse-$(CONFIG_FUSE_BPF) += backing.o

virtiofs-y := virtio_fs.o
diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
new file mode 100644
index 000000000000..3d895957b5ce
--- /dev/null
+++ b/fs/fuse/backing.c
@@ -0,0 +1,419 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * FUSE-BPF: Filesystem in Userspace with BPF
+ * Copyright (c) 2021 Google LLC
+ */
+
+#include "fuse_i.h"
+
+#include <linux/bpf_fuse.h>
+#include <linux/fdtable.h>
+#include <linux/file.h>
+#include <linux/fs_stack.h>
+#include <linux/namei.h>
+
+/*
+ * expression statement to wrap the backing filter logic
+ * struct inode *inode: inode with bpf and backing inode
+ * typedef io: (typically complex) type whose components fuse_args can point to.
+ * An instance of this type is created locally and passed to initialize
+ * int initialize_in(struct bpf_fuse_args *fa, io *in_out, args...): function that sets
+ * up the input arguments of fa and io based on args
+ * int initialize_out(struct bpf_fuse_args *fa, io *in_out, args...): function that sets
+ * up the output arguments of fa and io based on args
+ * int backing(struct bpf_fuse_args *fa, out, args...): function that actually performs
+ * the backing io operation
+ * int finalize(struct bpf_fuse_args *fa, out, args...): function that performs any final
+ * work needed to commit the backing io
+ */
+#define bpf_fuse_backing(inode, io, out, \
+ initialize_in, initialize_out, \
+ backing, finalize, args...) \
+({ \
+ struct fuse_inode *fuse_inode = get_fuse_inode(inode); \
+ struct bpf_fuse_args fa = { 0 }; \
+ bool initialized = false; \
+ bool handled = false; \
+ ssize_t res; \
+ io feo = { 0 }; \
+ int error = 0; \
+ \
+ do { \
+ if (!inode || !fuse_inode->backing_inode) \
+ break; \
+ \
+ handled = true; \
+ error = initialize_in(&fa, &feo, args); \
+ if (error) \
+ break; \
+ \
+ error = initialize_out(&fa, &feo, args); \
+ if (error) \
+ break; \
+ \
+ initialized = true; \
+ \
+ error = backing(&fa, out, args); \
+ if (error < 0) \
+ fa.info.error_in = error; \
+ \
+ } while (false); \
+ \
+ if (initialized && handled) { \
+ res = finalize(&fa, out, args); \
+ if (res) \
+ error = res; \
+ } \
+ \
+ *out = error ? _Generic((*out), \
+ default : \
+ error, \
+ struct dentry * : \
+ ERR_PTR(error), \
+ const char * : \
+ ERR_PTR(error) \
+ ) : (*out); \
+ handled; \
+})
+
+static void fuse_get_backing_path(struct file *file, struct path *path)
+{
+ path_get(&file->f_path);
+ *path = file->f_path;
+}
+
+static bool has_file(int type)
+{
+ return type == FUSE_ENTRY_BACKING;
+}
+
+/*
+ * The optional fuse bpf entry lists the backing file for a particular
+ * lookup. These are inherited by default.
+ *
+ * In the future, we may support multiple bpfs, and multiple backing files for
+ * the bpf to choose between.
+ *
+ * Currently, the expected format is possibly a bpf program, then the backing
+ * file. Changing only the bpf is valid, though meaningless if there isn't an
+ * inherited backing file.
+ *
+ * Support for the bpf program will be added in a later patch
+ *
+ */
+int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num)
+{
+ struct fuse_bpf_entry_out *fbeo;
+ struct file *file;
+ bool has_backing = false;
+ int num_entries;
+ int err = -EINVAL;
+ int i;
+
+ if (num > 0)
+ num_entries = num;
+ else
+ num_entries = FUSE_BPF_MAX_ENTRIES;
+
+ for (i = 0; i < num_entries; i++) {
+ file = NULL;
+ fbeo = &fbe->out[i];
+
+ /* reserved for future use */
+ if (fbeo->unused != 0)
+ goto out_err;
+
+ if (has_file(fbeo->entry_type)) {
+ file = fget(fbeo->fd);
+ if (!file) {
+ err = -EBADF;
+ goto out_err;
+ }
+ }
+
+ switch (fbeo->entry_type) {
+ case 0:
+ if (num == -1)
+ num_entries = i;
+ else
+ goto out_err;
+ break;
+ case FUSE_ENTRY_REMOVE_BACKING:
+ if (fbe->backing_action)
+ goto out_err;
+ fbe->backing_action = FUSE_BPF_REMOVE;
+ break;
+ case FUSE_ENTRY_BACKING:
+ if (fbe->backing_action)
+ goto out_err;
+ fuse_get_backing_path(file, &fbe->backing_path);
+ fbe->backing_action = FUSE_BPF_SET;
+ has_backing = true;
+ break;
+ default:
+ err = -EINVAL;
+ goto out_err;
+ }
+ if (has_file(fbeo->entry_type)) {
+ fput(file);
+ file = NULL;
+ }
+ }
+
+ fbe->is_used = num_entries > 0;
+
+ return 0;
+out_err:
+ if (file)
+ fput(file);
+ if (has_backing)
+ path_put_init(&fbe->backing_path);
+ return err;
+}
+
+static void fuse_stat_to_attr(struct fuse_conn *fc, struct inode *inode,
+ struct kstat *stat, struct fuse_attr *attr)
+{
+ unsigned int blkbits;
+
+ /* see the comment in fuse_change_attributes() */
+ if (fc->writeback_cache && S_ISREG(inode->i_mode)) {
+ stat->size = i_size_read(inode);
+ stat->mtime.tv_sec = inode->i_mtime.tv_sec;
+ stat->mtime.tv_nsec = inode->i_mtime.tv_nsec;
+ stat->ctime.tv_sec = inode->i_ctime.tv_sec;
+ stat->ctime.tv_nsec = inode->i_ctime.tv_nsec;
+ }
+
+ attr->ino = stat->ino;
+ attr->mode = (inode->i_mode & S_IFMT) | (stat->mode & 07777);
+ attr->nlink = stat->nlink;
+ attr->uid = from_kuid(fc->user_ns, stat->uid);
+ attr->gid = from_kgid(fc->user_ns, stat->gid);
+ attr->atime = stat->atime.tv_sec;
+ attr->atimensec = stat->atime.tv_nsec;
+ attr->mtime = stat->mtime.tv_sec;
+ attr->mtimensec = stat->mtime.tv_nsec;
+ attr->ctime = stat->ctime.tv_sec;
+ attr->ctimensec = stat->ctime.tv_nsec;
+ attr->size = stat->size;
+ attr->blocks = stat->blocks;
+
+ if (stat->blksize != 0)
+ blkbits = ilog2(stat->blksize);
+ else
+ blkbits = inode->i_sb->s_blocksize_bits;
+
+ attr->blksize = 1 << blkbits;
+}
+
+/*******************************************************************************
+ * Directory operations after here *
+ ******************************************************************************/
+
+struct fuse_lookup_args {
+ struct fuse_buffer name;
+ struct fuse_entry_out out;
+ struct fuse_bpf_entry entries_storage;
+ struct fuse_buffer bpf_entries;
+};
+
+static int fuse_lookup_initialize_in(struct bpf_fuse_args *fa, struct fuse_lookup_args *args,
+ struct inode *dir, struct dentry *entry, unsigned int flags)
+{
+ *args = (struct fuse_lookup_args) {
+ .name = (struct fuse_buffer) {
+ .data = (void *) entry->d_name.name,
+ .size = entry->d_name.len + 1,
+ .max_size = NAME_MAX + 1,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ },
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(dir)->nodeid,
+ .opcode = FUSE_LOOKUP,
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_lookup_initialize_out(struct bpf_fuse_args *fa, struct fuse_lookup_args *args,
+ struct inode *dir, struct dentry *entry, unsigned int flags)
+{
+ args->bpf_entries = (struct fuse_buffer) {
+ .data = args->entries_storage.out,
+ .size = 0,
+ .alloc_size = sizeof(args->entries_storage.out),
+ .max_size = sizeof(args->entries_storage.out),
+ .flags = BPF_FUSE_VARIABLE_SIZE,
+ };
+
+ fa->out_numargs = 2;
+ fa->flags = FUSE_BPF_OUT_ARGVAR | FUSE_BPF_IS_LOOKUP;
+ fa->out_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->out),
+ .value = &args->out,
+ };
+ fa->out_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->bpf_entries,
+ };
+
+ return 0;
+}
+
+static int fuse_lookup_backing(struct bpf_fuse_args *fa, struct dentry **out, struct inode *dir,
+ struct dentry *entry, unsigned int flags)
+{
+ struct fuse_dentry *fuse_entry = get_fuse_dentry(entry);
+ struct fuse_dentry *dir_fuse_entry = get_fuse_dentry(entry->d_parent);
+ struct dentry *dir_backing_entry = dir_fuse_entry->backing_path.dentry;
+ struct inode *dir_backing_inode = dir_backing_entry->d_inode;
+ struct fuse_entry_out *feo = (void *)fa->out_args[0].value;
+ struct dentry *backing_entry;
+ const char *name;
+ struct kstat stat;
+ int len;
+ int err;
+
+ /* TODO this will not handle lookups over mount points */
+ inode_lock_nested(dir_backing_inode, I_MUTEX_PARENT);
+ if (fa->in_args[0].buffer->flags & BPF_FUSE_MODIFIED) {
+ name = (char *)fa->in_args[0].buffer->data;
+ len = strnlen(name, fa->in_args[0].buffer->size);
+ } else {
+ name = entry->d_name.name;
+ len = entry->d_name.len;
+ }
+ backing_entry = lookup_one_len(name, dir_backing_entry, len);
+ inode_unlock(dir_backing_inode);
+
+ if (IS_ERR(backing_entry))
+ return PTR_ERR(backing_entry);
+
+ fuse_entry->backing_path = (struct path) {
+ .dentry = backing_entry,
+ .mnt = mntget(dir_fuse_entry->backing_path.mnt),
+ };
+
+ if (d_is_negative(backing_entry)) {
+ fa->info.error_in = -ENOENT;
+ return 0;
+ }
+
+ err = vfs_getattr(&fuse_entry->backing_path, &stat,
+ STATX_BASIC_STATS, 0);
+ if (err) {
+ path_put_init(&fuse_entry->backing_path);
+ return err;
+ }
+
+ fuse_stat_to_attr(get_fuse_conn(dir),
+ backing_entry->d_inode, &stat, &feo->attr);
+ return 0;
+}
+
+int fuse_handle_backing(struct fuse_bpf_entry *fbe, struct path *backing_path)
+{
+ switch (fbe->backing_action) {
+ case FUSE_BPF_UNCHANGED:
+ /* backing inode/path are added in fuse_lookup_backing */
+ break;
+
+ case FUSE_BPF_REMOVE:
+ path_put_init(backing_path);
+ break;
+
+ case FUSE_BPF_SET: {
+ if (!fbe->backing_path.dentry)
+ return -EINVAL;
+
+ path_put(backing_path);
+ *backing_path = fbe->backing_path;
+ fbe->backing_path.dentry = NULL;
+ fbe->backing_path.mnt = NULL;
+
+ break;
+ }
+
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int fuse_lookup_finalize(struct bpf_fuse_args *fa, struct dentry **out,
+ struct inode *dir, struct dentry *entry, unsigned int flags)
+{
+ struct fuse_dentry *fd;
+ struct dentry *backing_dentry;
+ struct inode *inode, *backing_inode;
+ struct inode *d_inode = entry->d_inode;
+ struct fuse_entry_out *feo = fa->out_args[0].value;
+ struct fuse_bpf_entry_out *febo = fa->out_args[1].buffer->data;
+ struct fuse_bpf_entry *fbe = container_of(febo, struct fuse_bpf_entry, out[0]);
+ int error = -1;
+ u64 target_nodeid = 0;
+
+ parse_fuse_bpf_entry(fbe, -1);
+ fd = get_fuse_dentry(entry);
+ if (!fd)
+ return -EIO;
+ error = fuse_handle_backing(fbe, &fd->backing_path);
+ if (error)
+ return error;
+ backing_dentry = fd->backing_path.dentry;
+ if (!backing_dentry)
+ return -ENOENT;
+ backing_inode = backing_dentry->d_inode;
+ if (!backing_inode) {
+ *out = 0;
+ return 0;
+ }
+
+ if (d_inode)
+ target_nodeid = get_fuse_inode(d_inode)->nodeid;
+
+ inode = fuse_iget_backing(dir->i_sb, target_nodeid, backing_inode);
+
+ if (IS_ERR(inode))
+ return PTR_ERR(inode);
+
+ get_fuse_inode(inode)->nodeid = feo->nodeid;
+
+ *out = d_splice_alias(inode, entry);
+ return 0;
+}
+
+int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags)
+{
+ return bpf_fuse_backing(dir, struct fuse_lookup_args, out,
+ fuse_lookup_initialize_in, fuse_lookup_initialize_out,
+ fuse_lookup_backing, fuse_lookup_finalize,
+ dir, entry, flags);
+}
+
+int fuse_revalidate_backing(struct dentry *entry, unsigned int flags)
+{
+ struct fuse_dentry *fuse_dentry = get_fuse_dentry(entry);
+ struct dentry *backing_entry = fuse_dentry->backing_path.dentry;
+
+ spin_lock(&backing_entry->d_lock);
+ if (d_unhashed(backing_entry)) {
+ spin_unlock(&backing_entry->d_lock);
+ return 0;
+ }
+ spin_unlock(&backing_entry->d_lock);
+
+ if (unlikely(backing_entry->d_flags & DCACHE_OP_REVALIDATE))
+ return backing_entry->d_op->d_revalidate(backing_entry, flags);
+ return 1;
+}
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index eb4f88e3dc97..a3029824c24f 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -238,6 +238,11 @@ void fuse_queue_forget(struct fuse_conn *fc, struct fuse_forget_link *forget,
{
struct fuse_iqueue *fiq = &fc->iq;

+ if (nodeid == 0) {
+ kfree(forget);
+ return;
+ }
+
forget->forget_one.nodeid = nodeid;
forget->forget_one.nlookup = nlookup;

@@ -1009,10 +1014,38 @@ static int fuse_copy_one(struct fuse_copy_state *cs, void *val, unsigned size)
return 0;
}

+/* Copy the fuse-bpf lookup args and verify them */
+#ifdef CONFIG_FUSE_BPF
+static int fuse_copy_lookup(struct fuse_copy_state *cs, void *val, unsigned size)
+{
+ struct fuse_bpf_entry_out *fbeo = (struct fuse_bpf_entry_out *)val;
+ struct fuse_bpf_entry *feb = container_of(fbeo, struct fuse_bpf_entry, out[0]);
+ int num_entries = size / sizeof(*fbeo);
+ int err;
+
+ if (size && size % sizeof(*fbeo) != 0)
+ return -EINVAL;
+
+ if (num_entries > FUSE_BPF_MAX_ENTRIES)
+ return -EINVAL;
+ err = fuse_copy_one(cs, val, size);
+ if (err)
+ return err;
+ if (size)
+ err = parse_fuse_bpf_entry(feb, num_entries);
+ return err;
+}
+#else
+static int fuse_copy_lookup(struct fuse_copy_state *cs, void *val, unsigned size)
+{
+ return fuse_copy_one(cs, val, size);
+}
+#endif
+
/* Copy request arguments to/from userspace buffer */
static int fuse_copy_args(struct fuse_copy_state *cs, unsigned numargs,
unsigned argpages, struct fuse_arg *args,
- int zeroing)
+ int zeroing, unsigned is_lookup)
{
int err = 0;
unsigned i;
@@ -1021,6 +1054,8 @@ static int fuse_copy_args(struct fuse_copy_state *cs, unsigned numargs,
struct fuse_arg *arg = &args[i];
if (i == numargs - 1 && argpages)
err = fuse_copy_pages(cs, arg->size, zeroing);
+ else if (i == numargs - 1 && is_lookup)
+ err = fuse_copy_lookup(cs, arg->value, arg->size);
else
err = fuse_copy_one(cs, arg->value, arg->size);
}
@@ -1298,7 +1333,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
err = fuse_copy_one(cs, &req->in.h, sizeof(req->in.h));
if (!err)
err = fuse_copy_args(cs, args->in_numargs, args->in_pages,
- (struct fuse_arg *) args->in_args, 0);
+ (struct fuse_arg *) args->in_args, 0, 0);
fuse_copy_finish(cs);
spin_lock(&fpq->lock);
clear_bit(FR_LOCKED, &req->flags);
@@ -1837,7 +1872,7 @@ static int copy_out_args(struct fuse_copy_state *cs, struct fuse_args *args,
lastarg->size -= diffsize;
}
return fuse_copy_args(cs, args->out_numargs, args->out_pages,
- args->out_args, args->page_zeroing);
+ args->out_args, args->page_zeroing, args->is_lookup);
}

/*
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 55dd6e8b2e43..73ebe3498fb9 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -34,7 +34,7 @@ static void fuse_advise_use_readdirplus(struct inode *dir)
set_bit(FUSE_I_ADVISE_RDPLUS, &fi->state);
}

-#if BITS_PER_LONG >= 64
+#if BITS_PER_LONG >= 64 && !defined(CONFIG_FUSE_BPF)
static inline void __fuse_dentry_settime(struct dentry *entry, u64 time)
{
entry->d_fsdata = (void *) time;
@@ -49,12 +49,12 @@ static inline u64 fuse_dentry_time(const struct dentry *entry)

static inline void __fuse_dentry_settime(struct dentry *dentry, u64 time)
{
- ((union fuse_dentry *) dentry->d_fsdata)->time = time;
+ ((struct fuse_dentry *) dentry->d_fsdata)->time = time;
}

static inline u64 fuse_dentry_time(const struct dentry *entry)
{
- return ((union fuse_dentry *) entry->d_fsdata)->time;
+ return ((struct fuse_dentry *) entry->d_fsdata)->time;
}
#endif

@@ -79,6 +79,17 @@ static void fuse_dentry_settime(struct dentry *dentry, u64 time)
__fuse_dentry_settime(dentry, time);
}

+void fuse_init_dentry_root(struct dentry *root, struct file *backing_dir)
+{
+#ifdef CONFIG_FUSE_BPF
+ struct fuse_dentry *fuse_dentry = root->d_fsdata;
+
+ if (backing_dir) {
+ fuse_dentry->backing_path = backing_dir->f_path;
+ path_get(&fuse_dentry->backing_path);
+ }
+#endif
+}

/*
* Set dentry and possibly attribute timeouts from the lookup/mk*
@@ -150,7 +161,8 @@ static void fuse_invalidate_entry(struct dentry *entry)

static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
u64 nodeid, const struct qstr *name,
- struct fuse_entry_out *outarg)
+ struct fuse_entry_out *outarg,
+ struct fuse_bpf_entry_out *bpf_outarg)
{
memset(outarg, 0, sizeof(struct fuse_entry_out));
args->opcode = FUSE_LOOKUP;
@@ -158,10 +170,43 @@ static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
args->in_numargs = 1;
args->in_args[0].size = name->len + 1;
args->in_args[0].value = name->name;
- args->out_numargs = 1;
+ args->out_argvar = true;
+ args->out_numargs = 2;
args->out_args[0].size = sizeof(struct fuse_entry_out);
args->out_args[0].value = outarg;
+ args->out_args[1].size = sizeof(struct fuse_bpf_entry_out) * FUSE_BPF_MAX_ENTRIES;
+ args->out_args[1].value = bpf_outarg;
+ args->is_lookup = 1;
+}
+
+#ifdef CONFIG_FUSE_BPF
+static bool backing_data_changed(struct fuse_inode *fi, struct dentry *entry,
+ struct fuse_bpf_entry *bpf_arg)
+{
+ struct path new_backing_path;
+ struct inode *new_backing_inode;
+ int err;
+ bool ret = true;
+
+ if (!entry)
+ return false;
+
+ get_fuse_backing_path(entry, &new_backing_path);
+
+ err = fuse_handle_backing(bpf_arg, &new_backing_path);
+ new_backing_inode = d_inode(new_backing_path.dentry);
+
+ if (err)
+ goto put_inode;
+
+ ret = (fi->backing_inode != new_backing_inode ||
+ !path_equal(&get_fuse_dentry(entry)->backing_path, &new_backing_path));
+
+put_inode:
+ path_put(&new_backing_path);
+ return ret;
}
+#endif

/*
* Check whether the dentry is still valid
@@ -183,9 +228,23 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
inode = d_inode_rcu(entry);
if (inode && fuse_is_bad(inode))
goto invalid;
- else if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
+
+#ifdef CONFIG_FUSE_BPF
+ /* TODO: Do we need bpf support for revalidate?
+ * If the lower filesystem says the entry is invalid, FUSE probably shouldn't
+ * try to fix that without going through the normal lookup path...
+ */
+ if (get_fuse_dentry(entry)->backing_path.dentry) {
+ ret = fuse_revalidate_backing(entry, flags);
+ if (ret <= 0) {
+ goto out;
+ }
+ }
+#endif
+ if (time_before64(fuse_dentry_time(entry), get_jiffies_64()) ||
(flags & (LOOKUP_EXCL | LOOKUP_REVAL | LOOKUP_RENAME_TARGET))) {
struct fuse_entry_out outarg;
+ struct fuse_bpf_entry bpf_arg;
FUSE_ARGS(args);
struct fuse_forget_link *forget;
u64 attr_version;
@@ -197,27 +256,44 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
ret = -ECHILD;
if (flags & LOOKUP_RCU)
goto out;
-
fm = get_fuse_mount(inode);

+ parent = dget_parent(entry);
+
+#ifdef CONFIG_FUSE_BPF
+ /* TODO: Once we're handling timeouts for backing inodes, do a
+ * bpf based lookup_revalidate here.
+ */
+ if (get_fuse_inode(parent->d_inode)->backing_inode) {
+ dput(parent);
+ ret = 1;
+ goto out;
+ }
+#endif
forget = fuse_alloc_forget();
ret = -ENOMEM;
- if (!forget)
+ if (!forget) {
+ dput(parent);
goto out;
+ }

attr_version = fuse_get_attr_version(fm->fc);

- parent = dget_parent(entry);
fuse_lookup_init(fm->fc, &args, get_node_id(d_inode(parent)),
- &entry->d_name, &outarg);
+ &entry->d_name, &outarg, bpf_arg.out);
ret = fuse_simple_request(fm, &args);
dput(parent);
+
/* Zero nodeid is same as -ENOENT */
if (!ret && !outarg.nodeid)
ret = -ENOENT;
- if (!ret) {
+ if (!ret || bpf_arg.is_used) {
fi = get_fuse_inode(inode);
if (outarg.nodeid != get_node_id(inode) ||
+#ifdef CONFIG_FUSE_BPF
+ (bpf_arg.is_used &&
+ backing_data_changed(fi, entry, &bpf_arg)) ||
+#endif
(bool) IS_AUTOMOUNT(inode) != (bool) (outarg.attr.flags & FUSE_ATTR_SUBMOUNT)) {
fuse_queue_forget(fm->fc, forget,
outarg.nodeid, 1);
@@ -259,17 +335,20 @@ static int fuse_dentry_revalidate(struct dentry *entry, unsigned int flags)
goto out;
}

-#if BITS_PER_LONG < 64
+#if BITS_PER_LONG < 64 || defined(CONFIG_FUSE_BPF)
static int fuse_dentry_init(struct dentry *dentry)
{
- dentry->d_fsdata = kzalloc(sizeof(union fuse_dentry),
+ dentry->d_fsdata = kzalloc(sizeof(struct fuse_dentry),
GFP_KERNEL_ACCOUNT | __GFP_RECLAIMABLE);

return dentry->d_fsdata ? 0 : -ENOMEM;
}
static void fuse_dentry_release(struct dentry *dentry)
{
- union fuse_dentry *fd = dentry->d_fsdata;
+ struct fuse_dentry *fd = dentry->d_fsdata;
+
+ if (fd && fd->backing_path.dentry)
+ path_put(&fd->backing_path);

kfree_rcu(fd, rcu);
}
@@ -310,7 +389,7 @@ static struct vfsmount *fuse_dentry_automount(struct path *path)
const struct dentry_operations fuse_dentry_operations = {
.d_revalidate = fuse_dentry_revalidate,
.d_delete = fuse_dentry_delete,
-#if BITS_PER_LONG < 64
+#if BITS_PER_LONG < 64 || defined(CONFIG_FUSE_BPF)
.d_init = fuse_dentry_init,
.d_release = fuse_dentry_release,
#endif
@@ -318,7 +397,7 @@ const struct dentry_operations fuse_dentry_operations = {
};

const struct dentry_operations fuse_root_dentry_operations = {
-#if BITS_PER_LONG < 64
+#if BITS_PER_LONG < 64 || defined(CONFIG_FUSE_BPF)
.d_init = fuse_dentry_init,
.d_release = fuse_dentry_release,
#endif
@@ -336,11 +415,13 @@ bool fuse_invalid_attr(struct fuse_attr *attr)
attr->size > LLONG_MAX;
}

-int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name,
- struct fuse_entry_out *outarg, struct inode **inode)
+int fuse_lookup_name(struct super_block *sb, u64 nodeid,
+ const struct qstr *name, struct fuse_entry_out *outarg,
+ struct dentry *entry, struct inode **inode)
{
struct fuse_mount *fm = get_fuse_mount_super(sb);
FUSE_ARGS(args);
+ struct fuse_bpf_entry bpf_arg = { 0 };
struct fuse_forget_link *forget;
u64 attr_version;
int err;
@@ -358,23 +439,56 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name

attr_version = fuse_get_attr_version(fm->fc);

- fuse_lookup_init(fm->fc, &args, nodeid, name, outarg);
+ fuse_lookup_init(fm->fc, &args, nodeid, name, outarg, bpf_arg.out);
err = fuse_simple_request(fm, &args);
- /* Zero nodeid is same as -ENOENT, but with valid timeout */
- if (err || !outarg->nodeid)
- goto out_put_forget;

- err = -EIO;
- if (!outarg->nodeid)
- goto out_put_forget;
- if (fuse_invalid_attr(&outarg->attr))
- goto out_put_forget;
-
- *inode = fuse_iget(sb, outarg->nodeid, outarg->generation,
- &outarg->attr, entry_attr_timeout(outarg),
- attr_version);
+#ifdef CONFIG_FUSE_BPF
+ if (bpf_arg.is_used) {
+ /* TODO Make sure this handles invalid handles */
+ struct path *backing_path;
+ struct inode *backing_inode;
+
+ err = -ENOENT;
+ if (!entry)
+ goto out_queue_forget;
+
+ err = -EINVAL;
+ backing_path = &bpf_arg.backing_path;
+ if (!backing_path->dentry)
+ goto out_queue_forget;
+
+ err = fuse_handle_backing(&bpf_arg,
+ &get_fuse_dentry(entry)->backing_path);
+ if (err)
+ goto out_queue_forget;
+
+ backing_inode = d_inode(get_fuse_dentry(entry)->backing_path.dentry);
+ *inode = fuse_iget_backing(sb, outarg->nodeid, backing_inode);
+ if (!*inode)
+ goto out_queue_forget;
+ } else
+#endif
+ {
+ /* Zero nodeid is same as -ENOENT, but with valid timeout */
+ if (err || !outarg->nodeid)
+ goto out_put_forget;
+
+ err = -EIO;
+ if (!outarg->nodeid)
+ goto out_put_forget;
+ if (fuse_invalid_attr(&outarg->attr))
+ goto out_put_forget;
+
+ *inode = fuse_iget(sb, outarg->nodeid, outarg->generation,
+ &outarg->attr, entry_attr_timeout(outarg),
+ attr_version);
+ }
+
err = -ENOMEM;
- if (!*inode) {
+#ifdef CONFIG_FUSE_BPF
+out_queue_forget:
+#endif
+ if (!*inode && outarg->nodeid) {
fuse_queue_forget(fm->fc, forget, outarg->nodeid, 1);
goto out;
}
@@ -399,9 +513,12 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
if (fuse_is_bad(dir))
return ERR_PTR(-EIO);

+ if (fuse_bpf_lookup(&newent, dir, entry, flags))
+ return newent;
+
locked = fuse_lock_inode(dir);
err = fuse_lookup_name(dir->i_sb, get_node_id(dir), &entry->d_name,
- &outarg, &inode);
+ &outarg, entry, &inode);
fuse_unlock_inode(dir, locked);
if (err == -ENOENT) {
outarg_valid = false;
@@ -1370,6 +1487,7 @@ static int fuse_permission(struct mnt_idmap *idmap,
struct fuse_conn *fc = get_fuse_conn(inode);
bool refreshed = false;
int err = 0;
+ struct fuse_inode *fi = get_fuse_inode(inode);

if (fuse_is_bad(inode))
return -EIO;
@@ -1382,7 +1500,6 @@ static int fuse_permission(struct mnt_idmap *idmap,
*/
if (fc->default_permissions ||
((mask & MAY_EXEC) && S_ISREG(inode->i_mode))) {
- struct fuse_inode *fi = get_fuse_inode(inode);
u32 perm_mask = STATX_MODE | STATX_UID | STATX_GID;

if (perm_mask & READ_ONCE(fi->inval_mask) ||
@@ -1559,7 +1676,7 @@ static long fuse_dir_compat_ioctl(struct file *file, unsigned int cmd,
FUSE_IOCTL_COMPAT | FUSE_IOCTL_DIR);
}

-static bool update_mtime(unsigned ivalid, bool trust_local_mtime)
+static inline bool update_mtime(unsigned int ivalid, bool trust_local_mtime)
{
/* Always update if mtime is explicitly set */
if (ivalid & ATTR_MTIME_SET)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index de37a3a06a71..25fb49f0a9f7 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -8,6 +8,7 @@

#include "fuse_i.h"

+#include <linux/filter.h>
#include <linux/pagemap.h>
#include <linux/slab.h>
#include <linux/kernel.h>
@@ -127,13 +128,18 @@ static void fuse_file_put(struct fuse_file *ff, bool sync, bool isdir)
}

struct fuse_file *fuse_file_open(struct fuse_mount *fm, u64 nodeid,
- unsigned int open_flags, bool isdir)
+ unsigned int open_flags, bool isdir, struct file *file)
{
struct fuse_conn *fc = fm->fc;
struct fuse_file *ff;
int opcode = isdir ? FUSE_OPENDIR : FUSE_OPEN;

- ff = fuse_file_alloc(fm);
+ if (file && file->private_data) {
+ ff = file->private_data;
+ file->private_data = NULL;
+ } else {
+ ff = fuse_file_alloc(fm);
+ }
if (!ff)
return ERR_PTR(-ENOMEM);

@@ -171,7 +177,7 @@ struct fuse_file *fuse_file_open(struct fuse_mount *fm, u64 nodeid,
int fuse_do_open(struct fuse_mount *fm, u64 nodeid, struct file *file,
bool isdir)
{
- struct fuse_file *ff = fuse_file_open(fm, nodeid, file->f_flags, isdir);
+ struct fuse_file *ff = fuse_file_open(fm, nodeid, file->f_flags, isdir, file);

if (!IS_ERR(ff))
file->private_data = ff;
@@ -1948,6 +1954,19 @@ int fuse_write_inode(struct inode *inode, struct writeback_control *wbc)
*/
WARN_ON(wbc->for_reclaim);

+ /**
+ * TODO - fully understand why this is necessary
+ *
+ * With fuse-bpf, fsstress fails if rename is enabled without this
+ *
+ * We are getting writes here on directory inodes, which do not have an
+ * initialized file list, so we crash.
+ *
+ * The question is why we are getting those writes
+ */
+ if (!S_ISREG(inode->i_mode))
+ return 0;
+
ff = __fuse_write_file_get(fi);
err = fuse_flush_times(inode, ff);
if (ff)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 01ca8bb87b4f..c24878f4a89f 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -16,6 +16,8 @@
#include <linux/fuse.h>
#include <linux/fs.h>
#include <linux/mount.h>
+#include <linux/pagemap.h>
+#include <linux/statfs.h>
#include <linux/wait.h>
#include <linux/list.h>
#include <linux/spinlock.h>
@@ -31,6 +33,7 @@
#include <linux/pid_namespace.h>
#include <linux/refcount.h>
#include <linux/user_namespace.h>
+#include <linux/magic.h>

/** Default max number of pages that can be used in a single read request */
#define FUSE_DEFAULT_MAX_PAGES_PER_REQ 32
@@ -64,11 +67,35 @@ struct fuse_forget_link {
};

/** FUSE specific dentry data */
-#if BITS_PER_LONG < 64
-union fuse_dentry {
- u64 time;
- struct rcu_head rcu;
+#if BITS_PER_LONG < 64 || defined(CONFIG_FUSE_BPF)
+struct fuse_dentry {
+ union {
+ u64 time;
+ struct rcu_head rcu;
+ };
+ struct path backing_path;
};
+
+static inline struct fuse_dentry *get_fuse_dentry(const struct dentry *entry)
+{
+ return entry->d_fsdata;
+}
+#endif
+
+#ifdef CONFIG_FUSE_BPF
+static inline void get_fuse_backing_path(const struct dentry *d,
+ struct path *path)
+{
+ struct fuse_dentry *di = get_fuse_dentry(d);
+
+ if (!di) {
+ *path = (struct path) { .mnt = 0, .dentry = 0 };
+ return;
+ }
+
+ *path = di->backing_path;
+ path_get(path);
+}
#endif

/** FUSE inode */
@@ -76,6 +103,14 @@ struct fuse_inode {
/** Inode data */
struct inode inode;

+#ifdef CONFIG_FUSE_BPF
+ /**
+ * Backing inode, if this inode is from a backing file system.
+ * If this is set, nodeid is 0.
+ */
+ struct inode *backing_inode;
+#endif
+
/** Unique ID, which identifies the inode between userspace
* and kernel */
u64 nodeid;
@@ -226,6 +261,14 @@ struct fuse_file {

} readdir;

+#ifdef CONFIG_FUSE_BPF
+ /**
+ * TODO: Reconcile with passthrough file
+ * backing file when in bpf mode
+ */
+ struct file *backing_file;
+#endif
+
/** RB node to be linked on fuse_conn->polled_files */
struct rb_node polled_node;

@@ -257,6 +300,7 @@ struct fuse_page_desc {
struct fuse_args {
uint64_t nodeid;
uint32_t opcode;
+ uint32_t error_in; // May need adjustments???
uint8_t in_numargs;
uint8_t out_numargs;
uint8_t ext_idx;
@@ -271,6 +315,7 @@ struct fuse_args {
bool page_replace:1;
bool may_block:1;
bool is_ext:1;
+ bool is_lookup:1;
struct fuse_in_arg in_args[3];
struct fuse_arg out_args[2];
void (*end)(struct fuse_mount *fm, struct fuse_args *args, int error);
@@ -524,6 +569,7 @@ struct fuse_fs_context {
unsigned int max_read;
unsigned int blksize;
const char *subtype;
+ struct file *root_dir;

/* DAX device, may be NULL */
struct dax_device *dax_dev;
@@ -970,12 +1016,16 @@ extern const struct dentry_operations fuse_root_dentry_operations;
/**
* Get a filled in inode
*/
+struct inode *fuse_iget_backing(struct super_block *sb,
+ u64 nodeid,
+ struct inode *backing_inode);
struct inode *fuse_iget(struct super_block *sb, u64 nodeid,
int generation, struct fuse_attr *attr,
u64 attr_valid, u64 attr_version);

int fuse_lookup_name(struct super_block *sb, u64 nodeid, const struct qstr *name,
- struct fuse_entry_out *outarg, struct inode **inode);
+ struct fuse_entry_out *outarg,
+ struct dentry *entry, struct inode **inode);

/**
* Send FORGET command
@@ -1120,6 +1170,7 @@ void fuse_invalidate_entry_cache(struct dentry *entry);
void fuse_invalidate_atime(struct inode *inode);

u64 entry_attr_timeout(struct fuse_entry_out *o);
+void fuse_init_dentry_root(struct dentry *root, struct file *backing_dir);
void fuse_change_entry_timeout(struct dentry *entry, struct fuse_entry_out *o);

/**
@@ -1328,10 +1379,46 @@ int fuse_fileattr_set(struct mnt_idmap *idmap,
/* file.c */

struct fuse_file *fuse_file_open(struct fuse_mount *fm, u64 nodeid,
- unsigned int open_flags, bool isdir);
+ unsigned int open_flags, bool isdir,
+ struct file *file);
void fuse_file_release(struct inode *inode, struct fuse_file *ff,
unsigned int open_flags, fl_owner_t id, bool isdir);

+/* backing.c */
+
+enum fuse_bpf_set {
+ FUSE_BPF_UNCHANGED = 0,
+ FUSE_BPF_SET,
+ FUSE_BPF_REMOVE,
+};
+
+struct fuse_bpf_entry {
+ struct fuse_bpf_entry_out out[FUSE_BPF_MAX_ENTRIES];
+
+ enum fuse_bpf_set backing_action;
+ struct path backing_path;
+ bool is_used;
+};
+
+int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num_entries);
+
+#ifdef CONFIG_FUSE_BPF
+
+int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
+
+#else
+
+static inline int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags)
+{
+ return 0;
+}
+
+#endif // CONFIG_FUSE_BPF
+
+int fuse_handle_backing(struct fuse_bpf_entry *feb, struct path *backing_path);
+
+int fuse_revalidate_backing(struct dentry *entry, unsigned int flags);
+
/*
* FUSE caches dentries and attributes with separate timeout. The
* time in jiffies until the dentry/attributes are valid is stored in
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index a824ca100047..b71e8758fab5 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -78,6 +78,9 @@ static struct inode *fuse_alloc_inode(struct super_block *sb)

fi->i_time = 0;
fi->inval_mask = 0;
+#ifdef CONFIG_FUSE_BPF
+ fi->backing_inode = NULL;
+#endif
fi->nodeid = 0;
fi->nlookup = 0;
fi->attr_version = 0;
@@ -120,6 +123,10 @@ static void fuse_evict_inode(struct inode *inode)
/* Will write inode on close/munmap and in all other dirtiers */
WARN_ON(inode->i_state & I_DIRTY_INODE);

+#ifdef CONFIG_FUSE_BPF
+ iput(fi->backing_inode);
+#endif
+
truncate_inode_pages_final(&inode->i_data);
clear_inode(inode);
if (inode->i_sb->s_flags & SB_ACTIVE) {
@@ -163,24 +170,24 @@ static ino_t fuse_squash_ino(u64 ino64)
}

static void fuse_fill_attr_from_inode(struct fuse_attr *attr,
- const struct fuse_inode *fi)
+ const struct inode *inode)
{
*attr = (struct fuse_attr){
- .ino = fi->inode.i_ino,
- .size = fi->inode.i_size,
- .blocks = fi->inode.i_blocks,
- .atime = fi->inode.i_atime.tv_sec,
- .mtime = fi->inode.i_mtime.tv_sec,
- .ctime = fi->inode.i_ctime.tv_sec,
- .atimensec = fi->inode.i_atime.tv_nsec,
- .mtimensec = fi->inode.i_mtime.tv_nsec,
- .ctimensec = fi->inode.i_ctime.tv_nsec,
- .mode = fi->inode.i_mode,
- .nlink = fi->inode.i_nlink,
- .uid = fi->inode.i_uid.val,
- .gid = fi->inode.i_gid.val,
- .rdev = fi->inode.i_rdev,
- .blksize = 1u << fi->inode.i_blkbits,
+ .ino = inode->i_ino,
+ .size = inode->i_size,
+ .blocks = inode->i_blocks,
+ .atime = inode->i_atime.tv_sec,
+ .mtime = inode->i_mtime.tv_sec,
+ .ctime = inode->i_ctime.tv_sec,
+ .atimensec = inode->i_atime.tv_nsec,
+ .mtimensec = inode->i_mtime.tv_nsec,
+ .ctimensec = inode->i_ctime.tv_nsec,
+ .mode = inode->i_mode,
+ .nlink = inode->i_nlink,
+ .uid = inode->i_uid.val,
+ .gid = inode->i_gid.val,
+ .rdev = inode->i_rdev,
+ .blksize = 1u << inode->i_blkbits,
};
}

@@ -352,8 +359,7 @@ static void fuse_init_inode(struct inode *inode, struct fuse_attr *attr,
else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
fuse_init_common(inode);
- init_special_inode(inode, inode->i_mode,
- new_decode_dev(attr->rdev));
+ init_special_inode(inode, inode->i_mode, attr->rdev);
} else
BUG();
/*
@@ -364,22 +370,100 @@ static void fuse_init_inode(struct inode *inode, struct fuse_attr *attr,
inode->i_acl = inode->i_default_acl = ACL_DONT_CACHE;
}

+struct fuse_inode_identifier {
+ u64 nodeid;
+ struct inode *backing_inode;
+};
+
static int fuse_inode_eq(struct inode *inode, void *_nodeidp)
{
- u64 nodeid = *(u64 *) _nodeidp;
- if (get_node_id(inode) == nodeid)
- return 1;
- else
- return 0;
+ struct fuse_inode_identifier *fii =
+ (struct fuse_inode_identifier *) _nodeidp;
+ struct fuse_inode *fi = get_fuse_inode(inode);
+
+ return fii->nodeid == fi->nodeid;
+}
+
+static int fuse_inode_backing_eq(struct inode *inode, void *_nodeidp)
+{
+ struct fuse_inode_identifier *fii =
+ (struct fuse_inode_identifier *) _nodeidp;
+ struct fuse_inode *fi = get_fuse_inode(inode);
+
+ return fii->nodeid == fi->nodeid
+#ifdef CONFIG_FUSE_BPF
+ && fii->backing_inode == fi->backing_inode
+#endif
+ ;
}

static int fuse_inode_set(struct inode *inode, void *_nodeidp)
{
- u64 nodeid = *(u64 *) _nodeidp;
- get_fuse_inode(inode)->nodeid = nodeid;
+ struct fuse_inode_identifier *fii =
+ (struct fuse_inode_identifier *) _nodeidp;
+ struct fuse_inode *fi = get_fuse_inode(inode);
+
+ fi->nodeid = fii->nodeid;
+
+ return 0;
+}
+
+static int fuse_inode_backing_set(struct inode *inode, void *_nodeidp)
+{
+ struct fuse_inode_identifier *fii =
+ (struct fuse_inode_identifier *) _nodeidp;
+ struct fuse_inode *fi = get_fuse_inode(inode);
+
+ fi->nodeid = fii->nodeid;
+#ifdef CONFIG_FUSE_BPF
+ BUG_ON(fi->backing_inode != NULL);
+ fi->backing_inode = fii->backing_inode;
+ if (fi->backing_inode)
+ ihold(fi->backing_inode);
+#endif
+
return 0;
}

+struct inode *fuse_iget_backing(struct super_block *sb, u64 nodeid,
+ struct inode *backing_inode)
+{
+ struct inode *inode;
+ struct fuse_inode *fi;
+ struct fuse_conn *fc = get_fuse_conn_super(sb);
+ struct fuse_inode_identifier fii = {
+ .nodeid = nodeid,
+ .backing_inode = backing_inode,
+ };
+ struct fuse_attr attr;
+ unsigned long hash = (unsigned long) backing_inode;
+
+ if (nodeid)
+ hash = nodeid;
+
+ fuse_fill_attr_from_inode(&attr, backing_inode);
+ inode = iget5_locked(sb, hash, fuse_inode_backing_eq,
+ fuse_inode_backing_set, &fii);
+ if (!inode)
+ return NULL;
+
+ if ((inode->i_state & I_NEW)) {
+ inode->i_flags |= S_NOATIME;
+ if (!fc->writeback_cache)
+ inode->i_flags |= S_NOCMTIME;
+ fuse_init_common(inode);
+ unlock_new_inode(inode);
+ }
+
+ fi = get_fuse_inode(inode);
+ fuse_init_inode(inode, &attr, fc);
+ spin_lock(&fi->lock);
+ fi->nlookup++;
+ spin_unlock(&fi->lock);
+
+ return inode;
+}
+
struct inode *fuse_iget(struct super_block *sb, u64 nodeid,
int generation, struct fuse_attr *attr,
u64 attr_valid, u64 attr_version)
@@ -387,6 +471,9 @@ struct inode *fuse_iget(struct super_block *sb, u64 nodeid,
struct inode *inode;
struct fuse_inode *fi;
struct fuse_conn *fc = get_fuse_conn_super(sb);
+ struct fuse_inode_identifier fii = {
+ .nodeid = nodeid,
+ };

/*
* Auto mount points get their node id from the submount root, which is
@@ -408,7 +495,7 @@ struct inode *fuse_iget(struct super_block *sb, u64 nodeid,
}

retry:
- inode = iget5_locked(sb, nodeid, fuse_inode_eq, fuse_inode_set, &nodeid);
+ inode = iget5_locked(sb, nodeid, fuse_inode_eq, fuse_inode_set, &fii);
if (!inode)
return NULL;

@@ -440,13 +527,16 @@ struct inode *fuse_ilookup(struct fuse_conn *fc, u64 nodeid,
{
struct fuse_mount *fm_iter;
struct inode *inode;
+ struct fuse_inode_identifier fii = {
+ .nodeid = nodeid,
+ };

WARN_ON(!rwsem_is_locked(&fc->killsb));
list_for_each_entry(fm_iter, &fc->mounts, fc_entry) {
if (!fm_iter->sb)
continue;

- inode = ilookup5(fm_iter->sb, nodeid, fuse_inode_eq, &nodeid);
+ inode = ilookup5(fm_iter->sb, nodeid, fuse_inode_eq, &fii);
if (inode) {
if (fm)
*fm = fm_iter;
@@ -676,6 +766,7 @@ enum {
OPT_ALLOW_OTHER,
OPT_MAX_READ,
OPT_BLKSIZE,
+ OPT_ROOT_DIR,
OPT_ERR
};

@@ -690,6 +781,7 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
fsparam_u32 ("max_read", OPT_MAX_READ),
fsparam_u32 ("blksize", OPT_BLKSIZE),
fsparam_string ("subtype", OPT_SUBTYPE),
+ fsparam_u32 ("root_dir", OPT_ROOT_DIR),
{}
};

@@ -773,6 +865,12 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
ctx->blksize = result.uint_32;
break;

+ case OPT_ROOT_DIR:
+ ctx->root_dir = fget(result.uint_32);
+ if (!ctx->root_dir)
+ return invalfc(fsc, "Unable to open root directory");
+ break;
+
default:
return -EINVAL;
}
@@ -785,6 +883,8 @@ static void fuse_free_fsc(struct fs_context *fsc)
struct fuse_fs_context *ctx = fsc->fs_private;

if (ctx) {
+ if (ctx->root_dir)
+ fput(ctx->root_dir);
kfree(ctx->subtype);
kfree(ctx);
}
@@ -912,15 +1012,29 @@ struct fuse_conn *fuse_conn_get(struct fuse_conn *fc)
}
EXPORT_SYMBOL_GPL(fuse_conn_get);

-static struct inode *fuse_get_root_inode(struct super_block *sb, unsigned mode)
+static struct inode *fuse_get_root_inode(struct super_block *sb,
+ unsigned int mode,
+ struct file *backing_fd)
{
struct fuse_attr attr;
- memset(&attr, 0, sizeof(attr));
+ struct inode *inode;

+ memset(&attr, 0, sizeof(attr));
attr.mode = mode;
attr.ino = FUSE_ROOT_ID;
attr.nlink = 1;
- return fuse_iget(sb, 1, 0, &attr, 0, 0);
+ inode = fuse_iget(sb, 1, 0, &attr, 0, 0);
+ if (!inode)
+ return NULL;
+
+#ifdef CONFIG_FUSE_BPF
+ if (backing_fd) {
+ get_fuse_inode(inode)->backing_inode = backing_fd->f_inode;
+ ihold(backing_fd->f_inode);
+ }
+#endif
+
+ return inode;
}

struct fuse_inode_handle {
@@ -935,11 +1049,14 @@ static struct dentry *fuse_get_dentry(struct super_block *sb,
struct inode *inode;
struct dentry *entry;
int err = -ESTALE;
+ struct fuse_inode_identifier fii = {
+ .nodeid = handle->nodeid,
+ };

if (handle->nodeid == 0)
goto out_err;

- inode = ilookup5(sb, handle->nodeid, fuse_inode_eq, &handle->nodeid);
+ inode = ilookup5(sb, handle->nodeid, fuse_inode_eq, &fii);
if (!inode) {
struct fuse_entry_out outarg;
const struct qstr name = QSTR_INIT(".", 1);
@@ -948,7 +1065,7 @@ static struct dentry *fuse_get_dentry(struct super_block *sb,
goto out_err;

err = fuse_lookup_name(sb, handle->nodeid, &name, &outarg,
- &inode);
+ NULL, &inode);
if (err && err != -ENOENT)
goto out_err;
if (err || !inode) {
@@ -1042,13 +1159,14 @@ static struct dentry *fuse_get_parent(struct dentry *child)
struct inode *inode;
struct dentry *parent;
struct fuse_entry_out outarg;
+ const struct qstr name = QSTR_INIT("..", 2);
int err;

if (!fc->export_support)
return ERR_PTR(-ESTALE);

err = fuse_lookup_name(child_inode->i_sb, get_node_id(child_inode),
- &dotdot_name, &outarg, &inode);
+ &name, &outarg, NULL, &inode);
if (err) {
if (err == -ENOENT)
return ERR_PTR(-ESTALE);
@@ -1452,7 +1570,7 @@ static int fuse_fill_super_submount(struct super_block *sb,
if (parent_sb->s_subtype && !sb->s_subtype)
return -ENOMEM;

- fuse_fill_attr_from_inode(&root_attr, parent_fi);
+ fuse_fill_attr_from_inode(&root_attr, &parent_fi->inode);
root = fuse_iget(sb, parent_fi->nodeid, 0, &root_attr, 0, 0);
/*
* This inode is just a duplicate, so it is not looked up and
@@ -1581,11 +1699,12 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
fc->no_force_umount = ctx->no_force_umount;

err = -ENOMEM;
- root = fuse_get_root_inode(sb, ctx->rootmode);
+ root = fuse_get_root_inode(sb, ctx->rootmode, ctx->root_dir);
sb->s_d_op = &fuse_root_dentry_operations;
root_dentry = d_make_root(root);
if (!root_dentry)
goto err_dev_free;
+ fuse_init_dentry_root(root_dentry, ctx->root_dir);
/* Root dentry doesn't have .d_revalidate */
sb->s_d_op = &fuse_dentry_operations;

diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
index 8e01bfdfc430..3542d992bde6 100644
--- a/fs/fuse/ioctl.c
+++ b/fs/fuse/ioctl.c
@@ -428,7 +428,7 @@ static struct fuse_file *fuse_priv_ioctl_prepare(struct inode *inode)
if (!S_ISREG(inode->i_mode) && !isdir)
return ERR_PTR(-ENOTTY);

- return fuse_file_open(fm, get_node_id(inode), O_RDONLY, isdir);
+ return fuse_file_open(fm, get_node_id(inode), O_RDONLY, isdir, NULL);
}

static void fuse_priv_ioctl_cleanup(struct inode *inode, struct fuse_file *ff)
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:06

by Daniel Rosenberg

Subject: [RFC PATCH v3 09/37] fuse-bpf: Add ioctl interface for /dev/fuse

This introduces an alternative method of responding to fuse requests.
Lookups supplying a backing fd or bpf will need to call through the
ioctl to ensure there can be no attempts to fool privileged processes
into inadvertently performing other actions.
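
As a rough illustration of the ioctl path described above, the sketch below
shows how a userspace daemon might build the command number for a response of
a given size, and mirrors the type/nr/direction check that fuse_dev_ioctl()
performs in this patch. The magic (229) and nr (125) come from the patch; the
helper names and SKETCH_BPF_RESPONSE_NR are hypothetical, and _IOC() is used
directly instead of the FUSE_DEV_IOC_BPF_RESPONSE(N) macro so that the size
can be a runtime value.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/ioctl.h>

#define FUSE_DEV_IOC_MAGIC      229   /* from include/uapi/linux/fuse.h */
#define SKETCH_BPF_RESPONSE_NR  125   /* nr used by FUSE_DEV_IOC_BPF_RESPONSE in this patch */

/* Hypothetical helper: encode the response ioctl for an n-byte payload. */
static unsigned int bpf_response_cmd(size_t n)
{
	/* The size must fit in the ioctl size field; the kernel side
	 * additionally rejects anything larger than PAGE_SIZE. */
	assert(n < (1u << _IOC_SIZEBITS));
	return (unsigned int)_IOC(_IOC_WRITE, FUSE_DEV_IOC_MAGIC,
				  SKETCH_BPF_RESPONSE_NR, n);
}

/* Hypothetical helper: same match the kernel applies in the default case. */
static int is_bpf_response(unsigned int cmd)
{
	return _IOC_TYPE(cmd) == FUSE_DEV_IOC_MAGIC &&
	       _IOC_NR(cmd) == SKETCH_BPF_RESPONSE_NR &&
	       _IOC_DIR(cmd) == _IOC_WRITE;
}
```

A daemon would presumably pass the encoded command and a pointer to its
response buffer to ioctl(2) on the /dev/fuse descriptor; the payload length
is recovered in the kernel with _IOC_SIZE(cmd).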

Signed-off-by: Daniel Rosenberg <[email protected]>
---
fs/fuse/dev.c | 56 ++++++++++++++++++++++++++++++++-------
fs/fuse/fuse_i.h | 1 +
include/uapi/linux/fuse.h | 1 +
3 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index a3029824c24f..ad7d9d1e6da5 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1016,18 +1016,19 @@ static int fuse_copy_one(struct fuse_copy_state *cs, void *val, unsigned size)

/* Copy the fuse-bpf lookup args and verify them */
#ifdef CONFIG_FUSE_BPF
-static int fuse_copy_lookup(struct fuse_copy_state *cs, void *val, unsigned size)
+static int fuse_copy_lookup(struct fuse_copy_state *cs, unsigned via_ioctl, void *val, unsigned size)
{
struct fuse_bpf_entry_out *fbeo = (struct fuse_bpf_entry_out *)val;
struct fuse_bpf_entry *feb = container_of(fbeo, struct fuse_bpf_entry, out[0]);
int num_entries = size / sizeof(*fbeo);
int err;

- if (size && size % sizeof(*fbeo) != 0)
+ if (size && (size % sizeof(*fbeo) != 0 || !via_ioctl))
return -EINVAL;

if (num_entries > FUSE_BPF_MAX_ENTRIES)
return -EINVAL;
+
err = fuse_copy_one(cs, val, size);
if (err)
return err;
@@ -1036,7 +1037,7 @@ static int fuse_copy_lookup(struct fuse_copy_state *cs, void *val, unsigned size
return err;
}
#else
-static int fuse_copy_lookup(struct fuse_copy_state *cs, void *val, unsigned size)
+static int fuse_copy_lookup(struct fuse_copy_state *cs, unsigned via_ioctl, void *val, unsigned size)
{
return fuse_copy_one(cs, val, size);
}
@@ -1045,7 +1046,7 @@ static int fuse_copy_lookup(struct fuse_copy_state *cs, void *val, unsigned size
/* Copy request arguments to/from userspace buffer */
static int fuse_copy_args(struct fuse_copy_state *cs, unsigned numargs,
unsigned argpages, struct fuse_arg *args,
- int zeroing, unsigned is_lookup)
+ int zeroing, unsigned is_lookup, unsigned via_ioct)
{
int err = 0;
unsigned i;
@@ -1055,7 +1056,7 @@ static int fuse_copy_args(struct fuse_copy_state *cs, unsigned numargs,
if (i == numargs - 1 && argpages)
err = fuse_copy_pages(cs, arg->size, zeroing);
else if (i == numargs - 1 && is_lookup)
- err = fuse_copy_lookup(cs, arg->value, arg->size);
+ err = fuse_copy_lookup(cs, via_ioct, arg->value, arg->size);
else
err = fuse_copy_one(cs, arg->value, arg->size);
}
@@ -1333,7 +1334,7 @@ static ssize_t fuse_dev_do_read(struct fuse_dev *fud, struct file *file,
err = fuse_copy_one(cs, &req->in.h, sizeof(req->in.h));
if (!err)
err = fuse_copy_args(cs, args->in_numargs, args->in_pages,
- (struct fuse_arg *) args->in_args, 0, 0);
+ (struct fuse_arg *) args->in_args, 0, 0, 0);
fuse_copy_finish(cs);
spin_lock(&fpq->lock);
clear_bit(FR_LOCKED, &req->flags);
@@ -1872,7 +1873,8 @@ static int copy_out_args(struct fuse_copy_state *cs, struct fuse_args *args,
lastarg->size -= diffsize;
}
return fuse_copy_args(cs, args->out_numargs, args->out_pages,
- args->out_args, args->page_zeroing, args->is_lookup);
+ args->out_args, args->page_zeroing, args->is_lookup,
+ args->via_ioctl);
}

/*
@@ -1882,7 +1884,7 @@ static int copy_out_args(struct fuse_copy_state *cs, struct fuse_args *args,
* it from the list and copy the rest of the buffer to the request.
* The request is finished by calling fuse_request_end().
*/
-static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
+static ssize_t fuse_dev_do_write(struct fuse_dev *fud, bool from_ioctl,
struct fuse_copy_state *cs, size_t nbytes)
{
int err;
@@ -1954,6 +1956,7 @@ static ssize_t fuse_dev_do_write(struct fuse_dev *fud,
if (!req->args->page_replace)
cs->move_pages = 0;

+ req->args->via_ioctl = from_ioctl;
if (oh.error)
err = nbytes != sizeof(oh) ? -EINVAL : 0;
else
@@ -1992,7 +1995,7 @@ static ssize_t fuse_dev_write(struct kiocb *iocb, struct iov_iter *from)

fuse_copy_init(&cs, 0, from);

- return fuse_dev_do_write(fud, &cs, iov_iter_count(from));
+ return fuse_dev_do_write(fud, false, &cs, iov_iter_count(from));
}

static ssize_t fuse_dev_splice_write(struct pipe_inode_info *pipe,
@@ -2073,7 +2076,7 @@ static ssize_t fuse_dev_splice_write(struct pipe_inode_info *pipe,
if (flags & SPLICE_F_MOVE)
cs.move_pages = 1;

- ret = fuse_dev_do_write(fud, &cs, len);
+ ret = fuse_dev_do_write(fud, false, &cs, len);

pipe_lock(pipe);
out_free:
@@ -2286,6 +2289,33 @@ static int fuse_device_clone(struct fuse_conn *fc, struct file *new)
return 0;
}

+// Provides an alternate means to respond to a fuse request
+static int fuse_handle_ioc_response(struct fuse_dev *dev, void *buff, uint32_t size)
+{
+ struct fuse_copy_state cs;
+ struct iovec *iov = NULL;
+ struct iov_iter iter;
+ int res;
+
+ if (size > PAGE_SIZE)
+ return -EINVAL;
+ iov = (struct iovec *) __get_free_page(GFP_KERNEL);
+ if (!iov)
+ return -ENOMEM;
+
+ iov->iov_base = buff;
+ iov->iov_len = size;
+
+ iov_iter_init(&iter, READ, iov, 1, size);
+ fuse_copy_init(&cs, 0, &iter);
+
+
+ res = fuse_dev_do_write(dev, true, &cs, size);
+ free_page((unsigned long) iov);
+
+ return res;
+}
+
static long fuse_dev_ioctl(struct file *file, unsigned int cmd,
unsigned long arg)
{
@@ -2318,6 +2348,12 @@ static long fuse_dev_ioctl(struct file *file, unsigned int cmd,
}
break;
default:
+ if (_IOC_TYPE(cmd) == FUSE_DEV_IOC_MAGIC
+ && _IOC_NR(cmd) == _IOC_NR(FUSE_DEV_IOC_BPF_RESPONSE(0))
+ && _IOC_DIR(cmd) == _IOC_WRITE) {
+ res = fuse_handle_ioc_response(fuse_get_dev(file), (void *) arg, _IOC_SIZE(cmd));
+ break;
+ }
res = -ENOTTY;
break;
}
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index c24878f4a89f..39a9fdf2a752 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -316,6 +316,7 @@ struct fuse_args {
bool may_block:1;
bool is_ext:1;
bool is_lookup:1;
+ bool via_ioctl:1;
struct fuse_in_arg in_args[3];
struct fuse_arg out_args[2];
void (*end)(struct fuse_mount *fm, struct fuse_args *args, int error);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 04d96f34e9a1..3ad725a3e968 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -1012,6 +1012,7 @@ struct fuse_notify_retrieve_in {
/* Device ioctls: */
#define FUSE_DEV_IOC_MAGIC 229
#define FUSE_DEV_IOC_CLONE _IOR(FUSE_DEV_IOC_MAGIC, 0, uint32_t)
+#define FUSE_DEV_IOC_BPF_RESPONSE(N) _IOW(FUSE_DEV_IOC_MAGIC, 125, char[N])

struct fuse_lseek_in {
uint64_t fh;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:11

by Daniel Rosenberg

Subject: [RFC PATCH v3 10/37] fuse-bpf: Don't support export_operations

In the future, we may choose to support these, but doing so poses some
challenges. In order to create a disconnected dentry/inode, we'll need
to encode the mountpoint and bpf into the file_handle, which means we'd
need a stable representation of them. This also won't hold up in cases
where the bpf is not stateless. One possibility is registering bpf
programs and mounts in a specific order, so they can be assigned
consistent ids we can use in the file_handle. We can defer to the lower
filesystem for the lower inode's representation in the file_handle.
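
The behaviour this patch adds can be modelled in plain C as follows. This is
only a sketch of the no-parent case: the struct, the function name and the
SKETCH_* constants are hypothetical stand-ins (0xff matches the kernel's
FILEID_INVALID and 0x81 is the type fuse returns for a handle without a
parent), and the real fuse_encode_fh() also handles the parent inode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FILEID_INVALID 0xff /* same value as the kernel's FILEID_INVALID */
#define SKETCH_FILEID_OK      0x81 /* fuse's handle type when no parent is encoded */

/* Hypothetical, reduced view of a fuse inode. */
struct sketch_fh_inode {
	uint64_t nodeid;      /* 0 when the inode exists only via a backing inode */
	bool has_backing;
	uint32_t generation;
};

static int sketch_encode_fh(const struct sketch_fh_inode *ino,
			    uint32_t fh[3], int *max_len)
{
	assert(ino && fh && max_len);

	if (*max_len < 3) {
		*max_len = 3;
		return SKETCH_FILEID_INVALID;
	}

	/* Backing-only inodes have no stable id to export, so refuse. */
	if (!ino->nodeid && ino->has_backing) {
		*max_len = 0;
		return SKETCH_FILEID_INVALID;
	}

	fh[0] = (uint32_t)(ino->nodeid >> 32);
	fh[1] = (uint32_t)(ino->nodeid & 0xffffffff);
	fh[2] = ino->generation;
	*max_len = 3;
	return SKETCH_FILEID_OK;
}
```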

Signed-off-by: Daniel Rosenberg <[email protected]>
---
fs/fuse/inode.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index b71e8758fab5..fe80984f099a 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1107,6 +1107,14 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len,
nodeid = get_fuse_inode(inode)->nodeid;
generation = inode->i_generation;

+#ifdef CONFIG_FUSE_BPF
+ /* TODO: Does it make sense to support this in some cases? */
+ if (!nodeid && get_fuse_inode(inode)->backing_inode) {
+ *max_len = 0;
+ return FILEID_INVALID;
+ }
+#endif
+
fh[0] = (u32)(nodeid >> 32);
fh[1] = (u32)(nodeid & 0xffffffff);
fh[2] = generation;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:12

by Daniel Rosenberg

Subject: [RFC PATCH v3 11/37] fuse-bpf: Add support for access

This adds backing support for FUSE_ACCESS.
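
Judging from the call sites in this patch, fuse_bpf_access() appears to
return nonzero when the backing path handled the request, with the actual
result stored through the out pointer, and zero when the caller should fall
through to the regular daemon path. The sketch below models only that calling
convention; every name in it is hypothetical, the permission check is a
simple bitmask stand-in for inode_permission(), and the daemon fallback is a
placeholder.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAY_WRITE 0x2 /* same values as the kernel's MAY_WRITE/MAY_READ */
#define SKETCH_MAY_READ  0x4

/* Hypothetical, reduced inode: a backing flag plus the access bits it permits. */
struct sketch_acc_inode {
	int has_backing;
	int allowed;
};

/* Returns 1 if handled (result in *out), 0 to fall through to the daemon. */
static int sketch_bpf_access(int *out, const struct sketch_acc_inode *inode, int mask)
{
	assert(out && inode);
	if (!inode->has_backing)
		return 0;
	*out = (mask & ~inode->allowed) ? -EACCES : 0;
	return 1;
}

/* Placeholder for the request that would normally go to userspace. */
static int sketch_daemon_access(const struct sketch_acc_inode *inode, int mask)
{
	(void)inode;
	(void)mask;
	return -ENOSYS;
}

static int sketch_access(const struct sketch_acc_inode *inode, int mask)
{
	int err;

	if (sketch_bpf_access(&err, inode, mask))
		return err;
	return sketch_daemon_access(inode, mask);
}
```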

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dir.c | 6 ++++++
fs/fuse/fuse_i.h | 6 ++++++
3 files changed, 59 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 3d895957b5ce..e42622584037 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -417,3 +417,50 @@ int fuse_revalidate_backing(struct dentry *entry, unsigned int flags)
return backing_entry->d_op->d_revalidate(backing_entry, flags);
return 1;
}
+
+static int fuse_access_initialize_in(struct bpf_fuse_args *fa, struct fuse_access_in *in,
+ struct inode *inode, int mask)
+{
+ *in = (struct fuse_access_in) {
+ .mask = mask,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .opcode = FUSE_ACCESS,
+ .nodeid = get_node_id(inode),
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(*in),
+ .in_args[0].value = in,
+ };
+
+ return 0;
+}
+
+static int fuse_access_initialize_out(struct bpf_fuse_args *fa, struct fuse_access_in *in,
+ struct inode *inode, int mask)
+{
+ return 0;
+}
+
+static int fuse_access_backing(struct bpf_fuse_args *fa, int *out, struct inode *inode, int mask)
+{
+ struct fuse_inode *fi = get_fuse_inode(inode);
+ const struct fuse_access_in *fai = fa->in_args[0].value;
+
+ *out = inode_permission(&nop_mnt_idmap, fi->backing_inode, fai->mask);
+ return 0;
+}
+
+static int fuse_access_finalize(struct bpf_fuse_args *fa, int *out, struct inode *inode, int mask)
+{
+ return 0;
+}
+
+int fuse_bpf_access(int *out, struct inode *inode, int mask)
+{
+ return bpf_fuse_backing(inode, struct fuse_access_in, out,
+ fuse_access_initialize_in, fuse_access_initialize_out,
+ fuse_access_backing, fuse_access_finalize, inode, mask);
+}
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 73ebe3498fb9..535e6cf9e970 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1439,6 +1439,9 @@ static int fuse_access(struct inode *inode, int mask)
struct fuse_access_in inarg;
int err;

+ if (fuse_bpf_access(&err, inode, mask))
+ return err;
+
BUG_ON(mask & MAY_NOT_BLOCK);

if (fm->fc->no_access)
@@ -1495,6 +1498,9 @@ static int fuse_permission(struct mnt_idmap *idmap,
if (!fuse_allow_current_process(fc))
return -EACCES;

+ if (fuse_bpf_access(&err, inode, mask))
+ return err;
+
/*
* If attributes are needed, refresh them before proceeding
*/
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 39a9fdf2a752..cb166168f9c2 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1406,6 +1406,7 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num_entries);
#ifdef CONFIG_FUSE_BPF

int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
+int fuse_bpf_access(int *out, struct inode *inode, int mask);

#else

@@ -1414,6 +1415,11 @@ static inline int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct
return 0;
}

+static inline int fuse_bpf_access(int *out, struct inode *inode, int mask)
+{
+ return 0;
+}
+
#endif // CONFIG_FUSE_BPF

int fuse_handle_backing(struct fuse_bpf_entry *feb, struct path *backing_path);
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:14

by Daniel Rosenberg

Subject: [RFC PATCH v3 12/37] fuse-bpf: Partially add mapping support

This adds a backing implementation for mapping, but it is not yet
hooked into the infrastructure that will call the bpf programs.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 37 +++++++++++++++++++++++++++++++++++++
fs/fuse/file.c | 6 ++++++
fs/fuse/fuse_i.h | 4 +++-
3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index e42622584037..930aa370e376 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -207,6 +207,43 @@ static void fuse_stat_to_attr(struct fuse_conn *fc, struct inode *inode,
attr->blksize = 1 << blkbits;
}

+ssize_t fuse_backing_mmap(struct file *file, struct vm_area_struct *vma)
+{
+ int ret;
+ struct fuse_file *ff = file->private_data;
+ struct inode *fuse_inode = file_inode(file);
+ struct file *backing_file = ff->backing_file;
+ struct inode *backing_inode = file_inode(backing_file);
+
+ if (!backing_file->f_op->mmap)
+ return -ENODEV;
+
+ if (WARN_ON(file != vma->vm_file))
+ return -EIO;
+
+ vma->vm_file = get_file(backing_file);
+
+ ret = call_mmap(vma->vm_file, vma);
+
+ if (ret)
+ fput(backing_file);
+ else
+ fput(file);
+
+ if (file->f_flags & O_NOATIME)
+ return ret;
+
+ if ((!timespec64_equal(&fuse_inode->i_mtime, &backing_inode->i_mtime) ||
+ !timespec64_equal(&fuse_inode->i_ctime,
+ &backing_inode->i_ctime))) {
+ fuse_inode->i_mtime = backing_inode->i_mtime;
+ fuse_inode->i_ctime = backing_inode->i_ctime;
+ }
+ touch_atime(&file->f_path);
+
+ return ret;
+}
+
/*******************************************************************************
* Directory operations after here *
******************************************************************************/
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 25fb49f0a9f7..865167a80d35 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2527,6 +2527,12 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
if (FUSE_IS_DAX(file_inode(file)))
return fuse_dax_mmap(file, vma);

+#ifdef CONFIG_FUSE_BPF
+ /* TODO - this is simply passthrough, not a proper BPF filter */
+ if (ff->backing_file)
+ return fuse_backing_mmap(file, vma);
+#endif
+
if (ff->open_flags & FOPEN_DIRECT_IO) {
/* Can't provide the coherency needed for MAP_SHARED */
if (vma->vm_flags & VM_MAYSHARE)
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index cb166168f9c2..5eb357f482dc 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1422,7 +1422,9 @@ static inline int fuse_bpf_access(int *out, struct inode *inode, int mask)

#endif // CONFIG_FUSE_BPF

-int fuse_handle_backing(struct fuse_bpf_entry *feb, struct path *backing_path);
+ssize_t fuse_backing_mmap(struct file *file, struct vm_area_struct *vma);
+
+int fuse_handle_backing(struct fuse_bpf_entry *fbe, struct path *backing_path);

int fuse_revalidate_backing(struct dentry *entry, unsigned int flags);

--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:32

by Daniel Rosenberg

Subject: [RFC PATCH v3 13/37] fuse-bpf: Add lseek support

This adds backing support for FUSE_LSEEK.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/file.c | 3 ++
fs/fuse/fuse_i.h | 6 ++++
3 files changed, 98 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 930aa370e376..c4916dde48c8 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -207,6 +207,95 @@ static void fuse_stat_to_attr(struct fuse_conn *fc, struct inode *inode,
attr->blksize = 1 << blkbits;
}

+struct fuse_lseek_args {
+ struct fuse_lseek_in in;
+ struct fuse_lseek_out out;
+};
+
+static int fuse_lseek_initialize_in(struct bpf_fuse_args *fa, struct fuse_lseek_args *args,
+ struct file *file, loff_t offset, int whence)
+{
+ struct fuse_file *fuse_file = file->private_data;
+
+ args->in = (struct fuse_lseek_in) {
+ .fh = fuse_file->fh,
+ .offset = offset,
+ .whence = whence,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(file->f_inode),
+ .opcode = FUSE_LSEEK,
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(args->in),
+ .in_args[0].value = &args->in,
+ };
+
+ return 0;
+}
+
+static int fuse_lseek_initialize_out(struct bpf_fuse_args *fa, struct fuse_lseek_args *args,
+ struct file *file, loff_t offset, int whence)
+{
+ fa->out_numargs = 1;
+ fa->out_args[0].size = sizeof(args->out);
+ fa->out_args[0].value = &args->out;
+
+ return 0;
+}
+
+static int fuse_lseek_backing(struct bpf_fuse_args *fa, loff_t *out,
+ struct file *file, loff_t offset, int whence)
+{
+ const struct fuse_lseek_in *fli = fa->in_args[0].value;
+ struct fuse_lseek_out *flo = fa->out_args[0].value;
+ struct fuse_file *fuse_file = file->private_data;
+ struct file *backing_file = fuse_file->backing_file;
+
+ /* TODO: Handle changing of the file handle */
+ if (offset == 0) {
+ if (whence == SEEK_CUR) {
+ flo->offset = file->f_pos;
+ *out = flo->offset;
+ return 0;
+ }
+
+ if (whence == SEEK_SET) {
+ flo->offset = vfs_setpos(file, 0, 0);
+ *out = flo->offset;
+ return 0;
+ }
+ }
+
+ inode_lock(file->f_inode);
+ backing_file->f_pos = file->f_pos;
+ *out = vfs_llseek(backing_file, fli->offset, fli->whence);
+ flo->offset = *out;
+ inode_unlock(file->f_inode);
+ return 0;
+}
+
+static int fuse_lseek_finalize(struct bpf_fuse_args *fa, loff_t *out,
+ struct file *file, loff_t offset, int whence)
+{
+ struct fuse_lseek_out *flo = fa->out_args[0].value;
+
+ if (!fa->info.error_in)
+ file->f_pos = flo->offset;
+ *out = flo->offset;
+ return 0;
+}
+
+int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence)
+{
+ return bpf_fuse_backing(inode, struct fuse_lseek_args, out,
+ fuse_lseek_initialize_in, fuse_lseek_initialize_out,
+ fuse_lseek_backing, fuse_lseek_finalize,
+ file, offset, whence);
+}
+
ssize_t fuse_backing_mmap(struct file *file, struct vm_area_struct *vma)
{
int ret;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 865167a80d35..9758bd1665a6 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2779,6 +2779,9 @@ static loff_t fuse_file_llseek(struct file *file, loff_t offset, int whence)
loff_t retval;
struct inode *inode = file_inode(file);

+ if (fuse_bpf_lseek(&retval, inode, file, offset, whence))
+ return retval;
+
switch (whence) {
case SEEK_SET:
case SEEK_CUR:
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 5eb357f482dc..624d0cebd287 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1405,11 +1405,17 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num_entries);

#ifdef CONFIG_FUSE_BPF

+int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence);
int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
int fuse_bpf_access(int *out, struct inode *inode, int mask);

#else

+static inline int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence)
+{
+ return 0;
+}
+
static inline int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:46

by Daniel Rosenberg

Subject: [RFC PATCH v3 15/37] fuse-bpf: Support file/dir open/close

This adds backing support for FUSE_OPEN, FUSE_OPENDIR, FUSE_CREATE,
FUSE_RELEASE, and FUSE_RELEASEDIR.
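
Before opening the backing file, fuse_open_backing() in this patch derives a
permission mask from the access mode in the open flags and checks it against
the backing inode. A standalone sketch of just that mapping is shown below;
the function name and the SKETCH_* constants are hypothetical (the values
match the kernel's MAY_READ and MAY_WRITE), and the permission check itself
is left out.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

#define SKETCH_MAY_WRITE 0x2 /* same values as the kernel's MAY_WRITE/MAY_READ */
#define SKETCH_MAY_READ  0x4

/* Translate the access mode of open flags into a permission mask.
 * Returns 0 on success, -EINVAL for an unrecognised access mode. */
static int sketch_open_mask(int flags, int *mask)
{
	assert(mask);

	switch (flags & O_ACCMODE) {
	case O_RDONLY:
		*mask = SKETCH_MAY_READ;
		return 0;
	case O_WRONLY:
		*mask = SKETCH_MAY_WRITE;
		return 0;
	case O_RDWR:
		*mask = SKETCH_MAY_READ | SKETCH_MAY_WRITE;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Other flag bits such as O_APPEND do not affect the result, since only the
bits covered by O_ACCMODE are examined.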

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 368 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dir.c | 8 +
fs/fuse/file.c | 7 +
fs/fuse/fuse_i.h | 26 ++++
4 files changed, 409 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index ee315598bc3f..d4a214cadc15 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -207,6 +207,374 @@ static void fuse_stat_to_attr(struct fuse_conn *fc, struct inode *inode,
attr->blksize = 1 << blkbits;
}

+struct fuse_open_args {
+ struct fuse_open_in in;
+ struct fuse_open_out out;
+};
+
+static int fuse_open_initialize_in(struct bpf_fuse_args *fa, struct fuse_open_args *args,
+ struct inode *inode, struct file *file, bool isdir)
+{
+ args->in = (struct fuse_open_in) {
+ .flags = file->f_flags & ~(O_CREAT | O_EXCL | O_NOCTTY),
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(inode)->nodeid,
+ .opcode = isdir ? FUSE_OPENDIR : FUSE_OPEN,
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_open_initialize_out(struct bpf_fuse_args *fa, struct fuse_open_args *args,
+ struct inode *inode, struct file *file, bool isdir)
+{
+ args->out = (struct fuse_open_out) { 0 };
+
+ fa->out_numargs = 1;
+ fa->out_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->out),
+ .value = &args->out,
+ };
+
+ return 0;
+}
+
+static int fuse_open_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *inode, struct file *file, bool isdir)
+{
+ struct fuse_mount *fm = get_fuse_mount(inode);
+ const struct fuse_open_in *foi = fa->in_args[0].value;
+ struct fuse_file *ff;
+ int mask;
+ struct fuse_dentry *fd = get_fuse_dentry(file->f_path.dentry);
+ struct file *backing_file;
+
+ ff = fuse_file_alloc(fm);
+ if (!ff)
+ return -ENOMEM;
+ file->private_data = ff;
+
+ switch (foi->flags & O_ACCMODE) {
+ case O_RDONLY:
+ mask = MAY_READ;
+ break;
+
+ case O_WRONLY:
+ mask = MAY_WRITE;
+ break;
+
+ case O_RDWR:
+ mask = MAY_READ | MAY_WRITE;
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ *out = inode_permission(&nop_mnt_idmap,
+ get_fuse_inode(inode)->backing_inode, mask);
+ if (*out)
+ return *out;
+
+ backing_file =
+ dentry_open(&fd->backing_path, foi->flags, current_cred());
+
+ if (IS_ERR(backing_file)) {
+ fuse_file_free(ff);
+ file->private_data = NULL;
+ return PTR_ERR(backing_file);
+ }
+ ff->backing_file = backing_file;
+
+ *out = 0;
+ return 0;
+}
+
+static int fuse_open_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *inode, struct file *file, bool isdir)
+{
+ struct fuse_file *ff = file->private_data;
+ struct fuse_open_out *foo = fa->out_args[0].value;
+
+ if (ff) {
+ ff->fh = foo->fh;
+ ff->nodeid = get_fuse_inode(inode)->nodeid;
+ }
+ return 0;
+}
+
+int fuse_bpf_open(int *out, struct inode *inode, struct file *file, bool isdir)
+{
+ return bpf_fuse_backing(inode, struct fuse_open_args, out,
+ fuse_open_initialize_in, fuse_open_initialize_out,
+ fuse_open_backing, fuse_open_finalize,
+ inode, file, isdir);
+}
+
+struct fuse_create_open_args {
+ struct fuse_create_in in;
+ struct fuse_buffer name;
+ struct fuse_entry_out entry_out;
+ struct fuse_open_out open_out;
+};
+
+static int fuse_create_open_initialize_in(struct bpf_fuse_args *fa, struct fuse_create_open_args *args,
+ struct inode *dir, struct dentry *entry,
+ struct file *file, unsigned int flags, umode_t mode)
+{
+ args->in = (struct fuse_create_in) {
+ .flags = file->f_flags & ~(O_CREAT | O_EXCL | O_NOCTTY),
+ .mode = mode,
+ };
+
+ args->name = (struct fuse_buffer) {
+ .data = (void *) entry->d_name.name,
+ .size = entry->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(dir),
+ .opcode = FUSE_CREATE,
+ },
+ .in_numargs = 2,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_create_open_initialize_out(struct bpf_fuse_args *fa, struct fuse_create_open_args *args,
+ struct inode *dir, struct dentry *entry,
+ struct file *file, unsigned int flags, umode_t mode)
+{
+ args->entry_out = (struct fuse_entry_out) { 0 };
+ args->open_out = (struct fuse_open_out) { 0 };
+
+ fa->out_numargs = 2;
+ fa->out_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->entry_out),
+ .value = &args->entry_out,
+ };
+ fa->out_args[1] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->open_out),
+ .value = &args->open_out,
+ };
+
+ return 0;
+}
+
+static int fuse_open_file_backing(struct inode *inode, struct file *file)
+{
+ struct fuse_mount *fm = get_fuse_mount(inode);
+ struct dentry *entry = file->f_path.dentry;
+ struct fuse_dentry *fuse_dentry = get_fuse_dentry(entry);
+ struct fuse_file *fuse_file;
+ struct file *backing_file;
+
+ fuse_file = fuse_file_alloc(fm);
+ if (!fuse_file)
+ return -ENOMEM;
+ file->private_data = fuse_file;
+
+ backing_file = dentry_open(&fuse_dentry->backing_path, file->f_flags,
+ current_cred());
+ if (IS_ERR(backing_file)) {
+ fuse_file_free(fuse_file);
+ file->private_data = NULL;
+ return PTR_ERR(backing_file);
+ }
+ fuse_file->backing_file = backing_file;
+
+ return 0;
+}
+
+static int fuse_create_open_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry,
+ struct file *file, unsigned int flags, umode_t mode)
+{
+ struct fuse_inode *dir_fuse_inode = get_fuse_inode(dir);
+ struct path backing_path;
+ struct inode *inode = NULL;
+ struct dentry *backing_parent;
+ struct dentry *newent;
+ const struct fuse_create_in *fci = fa->in_args[0].value;
+
+ get_fuse_backing_path(entry, &backing_path);
+ if (!backing_path.dentry)
+ return -EBADF;
+
+ if (IS_ERR(backing_path.dentry))
+ return PTR_ERR(backing_path.dentry);
+
+ if (d_really_is_positive(backing_path.dentry)) {
+ *out = -EIO;
+ goto out;
+ }
+
+ backing_parent = dget_parent(backing_path.dentry);
+ inode_lock_nested(dir_fuse_inode->backing_inode, I_MUTEX_PARENT);
+ *out = vfs_create(&nop_mnt_idmap, d_inode(backing_parent),
+ backing_path.dentry, fci->mode, true);
+ inode_unlock(d_inode(backing_parent));
+ dput(backing_parent);
+ if (*out)
+ goto out;
+
+ inode = fuse_iget_backing(dir->i_sb, 0, backing_path.dentry->d_inode);
+ if (IS_ERR(inode)) {
+ *out = PTR_ERR(inode);
+ goto out;
+ }
+
+ newent = d_splice_alias(inode, entry);
+ if (IS_ERR(newent)) {
+ *out = PTR_ERR(newent);
+ goto out;
+ }
+
+ entry = newent ? newent : entry;
+ *out = finish_open(file, entry, fuse_open_file_backing);
+
+out:
+ path_put(&backing_path);
+ return *out;
+}
+
+static int fuse_create_open_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry,
+ struct file *file, unsigned int flags, umode_t mode)
+{
+ struct fuse_file *ff = file->private_data;
+ struct fuse_inode *fi = get_fuse_inode(file->f_inode);
+ struct fuse_entry_out *feo = fa->out_args[0].value;
+ struct fuse_open_out *foo = fa->out_args[1].value;
+
+ if (fi)
+ fi->nodeid = feo->nodeid;
+ if (ff)
+ ff->fh = foo->fh;
+ return 0;
+}
+
+int fuse_bpf_create_open(int *out, struct inode *dir, struct dentry *entry,
+ struct file *file, unsigned int flags, umode_t mode)
+{
+ return bpf_fuse_backing(dir, struct fuse_create_open_args, out,
+ fuse_create_open_initialize_in,
+ fuse_create_open_initialize_out,
+ fuse_create_open_backing,
+ fuse_create_open_finalize,
+ dir, entry, file, flags, mode);
+}
+
+static int fuse_release_initialize_in(struct bpf_fuse_args *fa, struct fuse_release_in *fri,
+ struct inode *inode, struct file *file)
+{
+ struct fuse_file *fuse_file = file->private_data;
+
+ /* Always put backing file whatever bpf/userspace says */
+ fput(fuse_file->backing_file);
+
+ *fri = (struct fuse_release_in) {
+ .fh = fuse_file->fh,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(inode)->nodeid,
+ .opcode = FUSE_RELEASE,
+ }, .in_numargs = 1,
+ .in_args[0].size = sizeof(*fri),
+ .in_args[0].value = fri,
+ };
+
+ return 0;
+}
+
+static int fuse_release_initialize_out(struct bpf_fuse_args *fa, struct fuse_release_in *fri,
+ struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+static int fuse_releasedir_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_release_in *fri,
+ struct inode *inode, struct file *file)
+{
+ struct fuse_file *fuse_file = file->private_data;
+
+ /* Always put backing file whatever bpf/userspace says */
+ fput(fuse_file->backing_file);
+
+ *fri = (struct fuse_release_in) {
+ .fh = fuse_file->fh,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(inode)->nodeid,
+ .opcode = FUSE_RELEASEDIR,
+ }, .in_numargs = 1,
+ .in_args[0].size = sizeof(*fri),
+ .in_args[0].value = fri,
+ };
+
+ return 0;
+}
+
+static int fuse_releasedir_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_release_in *fri,
+ struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+static int fuse_release_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+static int fuse_release_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *inode, struct file *file)
+{
+ fuse_file_free(file->private_data);
+ *out = 0;
+ return 0;
+}
+
+int fuse_bpf_release(int *out, struct inode *inode, struct file *file)
+{
+ return bpf_fuse_backing(inode, struct fuse_release_in, out,
+ fuse_release_initialize_in, fuse_release_initialize_out,
+ fuse_release_backing, fuse_release_finalize,
+ inode, file);
+}
+
+int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file)
+{
+ return bpf_fuse_backing(inode, struct fuse_release_in, out,
+ fuse_releasedir_initialize_in, fuse_releasedir_initialize_out,
+ fuse_release_backing, fuse_release_finalize, inode, file);
+}
+
struct fuse_lseek_args {
struct fuse_lseek_in in;
struct fuse_lseek_out out;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 535e6cf9e970..1df2bbc72396 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -719,6 +719,9 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
/* Userspace expects S_IFREG in create mode */
BUG_ON((mode & S_IFMT) != S_IFREG);

+ if (fuse_bpf_create_open(&err, dir, entry, file, flags, mode))
+ return err;
+
forget = fuse_alloc_forget();
err = -ENOMEM;
if (!forget)
@@ -1629,6 +1632,11 @@ static int fuse_dir_open(struct inode *inode, struct file *file)

static int fuse_dir_release(struct inode *inode, struct file *file)
{
+ int err = 0;
+
+ if (fuse_bpf_releasedir(&err, inode, file))
+ return err;
+
fuse_release_common(file, true);

return 0;
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 58cff04660db..1836d09d9ce3 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -243,6 +243,9 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir)
if (err)
return err;

+ if (fuse_bpf_open(&err, inode, file, isdir))
+ return err;
+
if (is_wb_truncate || dax_truncate)
inode_lock(inode);

@@ -351,6 +354,10 @@ static int fuse_open(struct inode *inode, struct file *file)
static int fuse_release(struct inode *inode, struct file *file)
{
struct fuse_conn *fc = get_fuse_conn(inode);
+ int err;
+
+ if (fuse_bpf_release(&err, inode, file))
+ return err;

/*
* Dirty pages might remain despite write_inode_now() call from
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 1dd9cc9720df..feecc1ebfdda 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1405,6 +1405,11 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num_entries);

#ifdef CONFIG_FUSE_BPF

+int fuse_bpf_open(int *err, struct inode *inode, struct file *file, bool isdir);
+int fuse_bpf_create_open(int *out, struct inode *dir, struct dentry *entry,
+ struct file *file, unsigned int flags, umode_t mode);
+int fuse_bpf_release(int *out, struct inode *inode, struct file *file);
+int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file);
int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence);
int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length);
int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
@@ -1412,6 +1417,27 @@ int fuse_bpf_access(int *out, struct inode *inode, int mask);

#else

+static inline int fuse_bpf_open(int *err, struct inode *inode, struct file *file, bool isdir)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_create_open(int *out, struct inode *dir, struct dentry *entry,
+ struct file *file, unsigned int flags, umode_t mode)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_release(int *out, struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file)
+{
+ return 0;
+}
+
static inline int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog
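
Reader's note, not part of the patch: fuse_open_backing() above derives a
permission mask from the access-mode bits of the open flags before it calls
inode_permission() on the backing inode. A minimal userspace sketch of that
mapping follows. The MODEL_* constants and model_open_flags_to_mask() are
hypothetical names chosen for illustration; only the switch structure is taken
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's MAY_WRITE and MAY_READ bits. */
#define MODEL_MAY_WRITE 0x2
#define MODEL_MAY_READ  0x4

/*
 * Map open(2) flags to a permission mask.
 * Returns 0 and stores the mask in *mask, or -EINVAL when the access-mode
 * bits do not name a valid mode.
 */
int model_open_flags_to_mask(int flags, int *mask)
{
	assert(mask != NULL);

	/* Only the access-mode bits matter; O_CREAT and friends are ignored. */
	switch (flags & O_ACCMODE) {
	case O_RDONLY:
		*mask = MODEL_MAY_READ;
		return 0;
	case O_WRONLY:
		*mask = MODEL_MAY_WRITE;
		return 0;
	case O_RDWR:
		*mask = MODEL_MAY_READ | MODEL_MAY_WRITE;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Calling model_open_flags_to_mask(O_RDWR, &mask) stores both bits in mask,
while a flags value whose access-mode bits are all set falls through to
-EINVAL, as in the default case of the patch.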

2023-04-18 01:43:50

by Daniel Rosenberg

Subject: [RFC PATCH v3 14/37] fuse-bpf: Add support for fallocate

This adds backing support for FUSE_FALLOCATE.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
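
Reader's note, not part of the commit: fuse_file_fallocate_backing() takes
mode, offset and length from fa->in_args[0] rather than from its own
parameters, so a request rewritten by an earlier filter stage is presumably
what reaches the lower filesystem. The sketch below models that ordering in
plain userspace C. Every name in it (model_ops, model_fallocate, and so on) is
hypothetical and does not appear in the series; it only illustrates the
pre-filter, backing call, post-filter sequence described in the cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the patch. */
struct model_fallocate_in {
	int mode;
	long long offset;
	long long length;
};

struct model_ops {
	/* Optional pre-filter: may rewrite the request in place. */
	void (*prefilter)(struct model_fallocate_in *in);
	/* Required backing step: stands in for the call on the lower file. */
	int (*backing)(const struct model_fallocate_in *in);
	/* Optional post-filter: may replace the result. */
	int (*postfilter)(int result);
};

/* Records the length the backing step last accepted, for inspection. */
long long model_last_length;

/* Example pre-filter: cap a single allocation at 4096 bytes. */
void model_clamp_length(struct model_fallocate_in *in)
{
	if (in->length > 4096)
		in->length = 4096;
}

/* Example backing step: validate like fallocate(2), then record the size. */
int model_backing(const struct model_fallocate_in *in)
{
	if (in->offset < 0 || in->length <= 0)
		return -EINVAL;
	model_last_length = in->length;
	return 0;
}

/* Run pre-filter, backing step, and post-filter in that order. */
int model_fallocate(const struct model_ops *ops, int mode,
		    long long offset, long long length)
{
	struct model_fallocate_in in = {
		.mode = mode,
		.offset = offset,
		.length = length,
	};
	int result;

	assert(ops != NULL && ops->backing != NULL);

	if (ops->prefilter)
		ops->prefilter(&in);
	/* The backing step sees the possibly rewritten request. */
	result = ops->backing(&in);
	if (ops->postfilter)
		result = ops->postfilter(result);
	return result;
}
```

With model_clamp_length installed as the pre-filter, a 1 MiB request reaches
model_backing as 4096 bytes. Leaving both filters NULL gives plain
passthrough.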
fs/fuse/backing.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/file.c | 3 +++
fs/fuse/fuse_i.h | 6 +++++
3 files changed, 69 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index c4916dde48c8..ee315598bc3f 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -333,6 +333,66 @@ ssize_t fuse_backing_mmap(struct file *file, struct vm_area_struct *vma)
return ret;
}

+static int fuse_file_fallocate_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_fallocate_in *in,
+ struct file *file, int mode, loff_t offset, loff_t length)
+{
+ struct fuse_file *ff = file->private_data;
+
+ *in = (struct fuse_fallocate_in) {
+ .fh = ff->fh,
+ .offset = offset,
+ .length = length,
+ .mode = mode,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .opcode = FUSE_FALLOCATE,
+ .nodeid = ff->nodeid,
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(*in),
+ .in_args[0].value = in,
+ };
+
+ return 0;
+}
+
+static int fuse_file_fallocate_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_fallocate_in *in,
+ struct file *file, int mode, loff_t offset, loff_t length)
+{
+ return 0;
+}
+
+static int fuse_file_fallocate_backing(struct bpf_fuse_args *fa, int *out,
+ struct file *file, int mode, loff_t offset, loff_t length)
+{
+ const struct fuse_fallocate_in *ffi = fa->in_args[0].value;
+ struct fuse_file *ff = file->private_data;
+
+ *out = vfs_fallocate(ff->backing_file, ffi->mode, ffi->offset,
+ ffi->length);
+ return 0;
+}
+
+static int fuse_file_fallocate_finalize(struct bpf_fuse_args *fa, int *out,
+ struct file *file, int mode, loff_t offset, loff_t length)
+{
+ return 0;
+}
+
+int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length)
+{
+ return bpf_fuse_backing(inode, struct fuse_fallocate_in, out,
+ fuse_file_fallocate_initialize_in,
+ fuse_file_fallocate_initialize_out,
+ fuse_file_fallocate_backing,
+ fuse_file_fallocate_finalize,
+ file, mode, offset, length);
+}
+
/*******************************************************************************
* Directory operations after here *
******************************************************************************/
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 9758bd1665a6..58cff04660db 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3071,6 +3071,9 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
(!(mode & FALLOC_FL_KEEP_SIZE) ||
(mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE)));

+ if (fuse_bpf_file_fallocate(&err, inode, file, mode, offset, length))
+ return err;
+
if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
FALLOC_FL_ZERO_RANGE))
return -EOPNOTSUPP;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 624d0cebd287..1dd9cc9720df 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1406,6 +1406,7 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num_entries);
#ifdef CONFIG_FUSE_BPF

int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence);
+int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length);
int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
int fuse_bpf_access(int *out, struct inode *inode, int mask);

@@ -1416,6 +1417,11 @@ static inline int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *
return 0;
}

+static inline int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length)
+{
+ return 0;
+}
+
static inline int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:51

by Daniel Rosenberg

Subject: [RFC PATCH v3 16/37] fuse-bpf: Support mknod/unlink/mkdir/rmdir

This adds backing support for FUSE_MKNOD, FUSE_MKDIR, FUSE_RMDIR,
and FUSE_UNLINK.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
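
Reader's note, not part of the commit: the mknod and mkdir backing functions
below apply the umask carried in the request to the requested mode, except
when the backing directory has POSIX ACLs enabled, in which case the mode is
passed down unmasked. A minimal sketch of that rule, assuming a hypothetical
helper name (model_effective_mode is not in the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of the patch: compute the mode handed to
 * the lower filesystem for a newly created node.
 */
mode_t model_effective_mode(mode_t mode, mode_t umask_bits,
			    bool dir_has_posix_acl)
{
	/* A umask only carries permission bits. */
	assert((umask_bits & ~(mode_t)0777) == 0);

	/* With POSIX ACLs the mode is left untouched at this level. */
	if (!dir_has_posix_acl)
		mode &= ~umask_bits;
	return mode;
}
```

For example, a request for mode 0666 under umask 0022 becomes 0644 on a
directory without POSIX ACLs and stays 0666 on one with them.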
fs/fuse/backing.c | 342 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dir.c | 14 ++
fs/fuse/fuse_i.h | 24 ++++
3 files changed, 380 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index d4a214cadc15..c6ef10aeec15 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -972,6 +972,348 @@ int fuse_revalidate_backing(struct dentry *entry, unsigned int flags)
return 1;
}

+struct fuse_mknod_args {
+ struct fuse_mknod_in in;
+ struct fuse_buffer name;
+};
+
+static int fuse_mknod_initialize_in(struct bpf_fuse_args *fa, struct fuse_mknod_args *args,
+ struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev)
+{
+ *args = (struct fuse_mknod_args) {
+ .in = (struct fuse_mknod_in) {
+ .mode = mode,
+ .rdev = new_encode_dev(rdev),
+ .umask = current_umask(),
+ },
+ .name = (struct fuse_buffer) {
+ .data = (void *) entry->d_name.name,
+ .size = entry->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ },
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(dir),
+ .opcode = FUSE_MKNOD,
+ },
+ .in_numargs = 2,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_mknod_initialize_out(struct bpf_fuse_args *fa, struct fuse_mknod_args *args,
+ struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev)
+{
+ return 0;
+}
+
+static int fuse_mknod_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev)
+{
+ const struct fuse_mknod_in *fmi = fa->in_args[0].value;
+ struct fuse_inode *fuse_inode = get_fuse_inode(dir);
+ struct inode *backing_inode = fuse_inode->backing_inode;
+ struct path backing_path;
+ struct inode *inode = NULL;
+
+ get_fuse_backing_path(entry, &backing_path);
+ if (!backing_path.dentry)
+ return -EBADF;
+
+ inode_lock_nested(backing_inode, I_MUTEX_PARENT);
+ mode = fmi->mode;
+ if (!IS_POSIXACL(backing_inode))
+ mode &= ~fmi->umask;
+ *out = vfs_mknod(&nop_mnt_idmap, backing_inode, backing_path.dentry, mode,
+ new_decode_dev(fmi->rdev));
+ inode_unlock(backing_inode);
+ if (*out)
+ goto out;
+ if (d_really_is_negative(backing_path.dentry) ||
+ unlikely(d_unhashed(backing_path.dentry))) {
+ *out = -EINVAL;
+ /*
+ * TODO: overlayfs responds to this situation with a
+ * lookup_one_len(). Should we do that too?
+ */
+ goto out;
+ }
+ inode = fuse_iget_backing(dir->i_sb, fuse_inode->nodeid, backing_inode);
+ if (IS_ERR(inode)) {
+ *out = PTR_ERR(inode);
+ goto out;
+ }
+ d_instantiate(entry, inode);
+out:
+ path_put(&backing_path);
+ return *out;
+}
+
+static int fuse_mknod_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev)
+{
+ return 0;
+}
+
+int fuse_bpf_mknod(int *out, struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev)
+{
+ return bpf_fuse_backing(dir, struct fuse_mknod_args, out,
+ fuse_mknod_initialize_in, fuse_mknod_initialize_out,
+ fuse_mknod_backing, fuse_mknod_finalize,
+ dir, entry, mode, rdev);
+}
+
+struct fuse_mkdir_args {
+ struct fuse_mkdir_in in;
+ struct fuse_buffer name;
+};
+
+static int fuse_mkdir_initialize_in(struct bpf_fuse_args *fa, struct fuse_mkdir_args *args,
+ struct inode *dir, struct dentry *entry, umode_t mode)
+{
+ *args = (struct fuse_mkdir_args) {
+ .in = (struct fuse_mkdir_in) {
+ .mode = mode,
+ .umask = current_umask(),
+ },
+ .name = (struct fuse_buffer) {
+ .data = (void *) entry->d_name.name,
+ .size = entry->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ },
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(dir),
+ .opcode = FUSE_MKDIR,
+ },
+ .in_numargs = 2,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_mkdir_initialize_out(struct bpf_fuse_args *fa, struct fuse_mkdir_args *args,
+ struct inode *dir, struct dentry *entry, umode_t mode)
+{
+ return 0;
+}
+
+static int fuse_mkdir_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry, umode_t mode)
+{
+ const struct fuse_mkdir_in *fmi = fa->in_args[0].value;
+ struct fuse_inode *fuse_inode = get_fuse_inode(dir);
+ struct inode *backing_inode = fuse_inode->backing_inode;
+ struct path backing_path;
+ struct inode *inode = NULL;
+ struct dentry *d;
+
+ get_fuse_backing_path(entry, &backing_path);
+ if (!backing_path.dentry)
+ return -EBADF;
+
+ inode_lock_nested(backing_inode, I_MUTEX_PARENT);
+ mode = fmi->mode;
+ if (!IS_POSIXACL(backing_inode))
+ mode &= ~fmi->umask;
+ *out = vfs_mkdir(&nop_mnt_idmap, backing_inode, backing_path.dentry,
+ mode);
+ if (*out)
+ goto out;
+ if (d_really_is_negative(backing_path.dentry) ||
+ unlikely(d_unhashed(backing_path.dentry))) {
+ d = lookup_one_len(entry->d_name.name,
+ backing_path.dentry->d_parent,
+ entry->d_name.len);
+ if (IS_ERR(d)) {
+ *out = PTR_ERR(d);
+ goto out;
+ }
+ dput(backing_path.dentry);
+ backing_path.dentry = d;
+ }
+ inode = fuse_iget_backing(dir->i_sb, fuse_inode->nodeid, backing_inode);
+ if (IS_ERR(inode)) {
+ *out = PTR_ERR(inode);
+ goto out;
+ }
+ d_instantiate(entry, inode);
+out:
+ inode_unlock(backing_inode);
+ path_put(&backing_path);
+ return *out;
+}
+
+static int fuse_mkdir_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry, umode_t mode)
+{
+ return 0;
+}
+
+int fuse_bpf_mkdir(int *out, struct inode *dir, struct dentry *entry, umode_t mode)
+{
+ return bpf_fuse_backing(dir, struct fuse_mkdir_args, out,
+ fuse_mkdir_initialize_in, fuse_mkdir_initialize_out,
+ fuse_mkdir_backing, fuse_mkdir_finalize,
+ dir, entry, mode);
+}
+
+static int fuse_rmdir_initialize_in(struct bpf_fuse_args *fa, struct fuse_buffer *name,
+ struct inode *dir, struct dentry *entry)
+{
+ *name = (struct fuse_buffer) {
+ .data = (void *) entry->d_name.name,
+ .size = entry->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(dir),
+ .opcode = FUSE_RMDIR,
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_rmdir_initialize_out(struct bpf_fuse_args *fa, struct fuse_buffer *name,
+ struct inode *dir, struct dentry *entry)
+{
+ return 0;
+}
+
+static int fuse_rmdir_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry)
+{
+ struct path backing_path;
+ struct dentry *backing_parent_dentry;
+ struct inode *backing_inode;
+
+ get_fuse_backing_path(entry, &backing_path);
+ if (!backing_path.dentry)
+ return -EBADF;
+
+ backing_parent_dentry = dget_parent(backing_path.dentry);
+ backing_inode = d_inode(backing_parent_dentry);
+
+ inode_lock_nested(backing_inode, I_MUTEX_PARENT);
+ *out = vfs_rmdir(&nop_mnt_idmap, backing_inode, backing_path.dentry);
+ inode_unlock(backing_inode);
+
+ dput(backing_parent_dentry);
+ if (!*out)
+ d_drop(entry);
+ path_put(&backing_path);
+ return *out;
+}
+
+static int fuse_rmdir_finalize(struct bpf_fuse_args *fa, int *out, struct inode *dir, struct dentry *entry)
+{
+ return 0;
+}
+
+int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *entry)
+{
+ return bpf_fuse_backing(dir, struct fuse_buffer, out,
+ fuse_rmdir_initialize_in, fuse_rmdir_initialize_out,
+ fuse_rmdir_backing, fuse_rmdir_finalize,
+ dir, entry);
+}
+
+static int fuse_unlink_initialize_in(struct bpf_fuse_args *fa, struct fuse_buffer *name,
+ struct inode *dir, struct dentry *entry)
+{
+ *name = (struct fuse_buffer) {
+ .data = (void *) entry->d_name.name,
+ .size = entry->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(dir),
+ .opcode = FUSE_UNLINK,
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_unlink_initialize_out(struct bpf_fuse_args *fa, struct fuse_buffer *name,
+ struct inode *dir, struct dentry *entry)
+{
+ return 0;
+}
+
+static int fuse_unlink_backing(struct bpf_fuse_args *fa, int *out, struct inode *dir, struct dentry *entry)
+{
+ struct path backing_path;
+ struct dentry *backing_parent_dentry;
+ struct inode *backing_inode;
+
+ get_fuse_backing_path(entry, &backing_path);
+ if (!backing_path.dentry)
+ return -EBADF;
+
+ /* TODO Not sure if we should reverify like overlayfs, or get inode from d_parent */
+ backing_parent_dentry = dget_parent(backing_path.dentry);
+ backing_inode = d_inode(backing_parent_dentry);
+
+ inode_lock_nested(backing_inode, I_MUTEX_PARENT);
+ *out = vfs_unlink(&nop_mnt_idmap, backing_inode, backing_path.dentry,
+ NULL);
+ inode_unlock(backing_inode);
+
+ dput(backing_parent_dentry);
+ if (!*out)
+ d_drop(entry);
+ path_put(&backing_path);
+ return *out;
+}
+
+static int fuse_unlink_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry)
+{
+ return 0;
+}
+
+int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry)
+{
+ return bpf_fuse_backing(dir, struct fuse_buffer, out,
+ fuse_unlink_initialize_in, fuse_unlink_initialize_out,
+ fuse_unlink_backing, fuse_unlink_finalize,
+ dir, entry);
+}
+
static int fuse_access_initialize_in(struct bpf_fuse_args *fa, struct fuse_access_in *in,
struct inode *inode, int mask)
{
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 1df2bbc72396..a763a45fa973 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -937,6 +937,10 @@ static int fuse_mknod(struct mnt_idmap *idmap, struct inode *dir,
struct fuse_mknod_in inarg;
struct fuse_mount *fm = get_fuse_mount(dir);
FUSE_ARGS(args);
+ int err;
+
+ if (fuse_bpf_mknod(&err, dir, entry, mode, rdev))
+ return err;

if (!fm->fc->dont_mask)
mode &= ~current_umask();
@@ -983,6 +987,10 @@ static int fuse_mkdir(struct mnt_idmap *idmap, struct inode *dir,
struct fuse_mkdir_in inarg;
struct fuse_mount *fm = get_fuse_mount(dir);
FUSE_ARGS(args);
+ int err;
+
+ if (fuse_bpf_mkdir(&err, dir, entry, mode))
+ return err;

if (!fm->fc->dont_mask)
mode &= ~current_umask();
@@ -1069,6 +1077,9 @@ static int fuse_unlink(struct inode *dir, struct dentry *entry)
if (fuse_is_bad(dir))
return -EIO;

+ if (fuse_bpf_unlink(&err, dir, entry))
+ return err;
+
args.opcode = FUSE_UNLINK;
args.nodeid = get_node_id(dir);
args.in_numargs = 1;
@@ -1092,6 +1103,9 @@ static int fuse_rmdir(struct inode *dir, struct dentry *entry)
if (fuse_is_bad(dir))
return -EIO;

+ if (fuse_bpf_rmdir(&err, dir, entry))
+ return err;
+
args.opcode = FUSE_RMDIR;
args.nodeid = get_node_id(dir);
args.in_numargs = 1;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index feecc1ebfdda..2cbe232c1048 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1408,6 +1408,10 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num_entries);
int fuse_bpf_open(int *err, struct inode *inode, struct file *file, bool isdir);
int fuse_bpf_create_open(int *out, struct inode *dir, struct dentry *entry,
struct file *file, unsigned int flags, umode_t mode);
+int fuse_bpf_mknod(int *out, struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev);
+int fuse_bpf_mkdir(int *out, struct inode *dir, struct dentry *entry, umode_t mode);
+int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *entry);
+int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry);
int fuse_bpf_release(int *out, struct inode *inode, struct file *file);
int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file);
int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence);
@@ -1428,6 +1432,26 @@ static inline int fuse_bpf_create_open(int *out, struct inode *dir, struct dentr
return 0;
}

+static inline int fuse_bpf_mknod(int *out, struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_mkdir(int *out, struct inode *dir, struct dentry *entry, umode_t mode)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *entry)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry)
+{
+ return 0;
+}
+
static inline int fuse_bpf_release(int *out, struct inode *inode, struct file *file)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:43:56

by Daniel Rosenberg

Subject: [RFC PATCH v3 20/37] fuse-bpf: Add Rename support

This adds backing support for FUSE_RENAME and FUSE_RENAME2.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
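
Reader's note, not part of the commit: fuse_rename_backing_common() below
rejects two cases based on the dentry returned by lock_rename(): -EINVAL when
the source is an ancestor of the target, and -ENOTEMPTY when the target is an
ancestor of the source. The sketch below models only that decision, using
normalized absolute path strings in place of dentries. Both function names are
hypothetical, and the string comparison is a stand-in for the kernel's
ancestry tracking, not a description of how lock_rename() works.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `anc` is a strict ancestor directory of `path`, else 0.
 * Assumes normalized absolute paths without a trailing slash; the root
 * directory "/" is not handled.
 */
int model_is_ancestor(const char *anc, const char *path)
{
	size_t n;

	assert(anc != NULL && path != NULL);
	n = strlen(anc);
	/* `path` must continue with a separator right after the prefix. */
	return strncmp(anc, path, n) == 0 && path[n] == '/';
}

/* Mirror the two error cases checked after lock_rename() in the patch. */
int model_rename_check(const char *old_path, const char *new_path)
{
	/* Moving a directory underneath itself. */
	if (model_is_ancestor(old_path, new_path))
		return -EINVAL;
	/* Replacing a directory that still contains the source. */
	if (model_is_ancestor(new_path, old_path))
		return -ENOTEMPTY;
	return 0;
}
```

Renaming "/a" to "/a/b/c" yields -EINVAL, renaming "/a/b/c" to "/a" yields
-ENOTEMPTY, and unrelated paths pass the check.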
fs/fuse/backing.c | 250 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dir.c | 7 ++
fs/fuse/fuse_i.h | 18 ++++
3 files changed, 275 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 30492f7b2a05..d3a706b55905 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -1747,6 +1747,256 @@ int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *entry)
dir, entry);
}

+static int fuse_rename_backing_common(struct inode *olddir,
+ struct dentry *oldent,
+ struct inode *newdir,
+ struct dentry *newent, unsigned int flags)
+{
+ int err = 0;
+ struct path old_backing_path;
+ struct path new_backing_path;
+ struct dentry *old_backing_dir_dentry;
+ struct dentry *old_backing_dentry;
+ struct dentry *new_backing_dir_dentry;
+ struct dentry *new_backing_dentry;
+ struct dentry *trap = NULL;
+ struct inode *target_inode;
+ struct renamedata rd;
+
+ /* TODO: Actually deal with changing anything that isn't a flag */
+ get_fuse_backing_path(oldent, &old_backing_path);
+ if (!old_backing_path.dentry)
+ return -EBADF;
+ get_fuse_backing_path(newent, &new_backing_path);
+ if (!new_backing_path.dentry) {
+ /*
+ * TODO A file being moved from a backing path to another
+ * backing path which is not yet instrumented with FUSE-BPF.
+ * This may be slow and should be substituted with something
+ * more clever.
+ */
+ err = -EXDEV;
+ goto put_old_path;
+ }
+ if (new_backing_path.mnt != old_backing_path.mnt) {
+ err = -EXDEV;
+ goto put_new_path;
+ }
+ old_backing_dentry = old_backing_path.dentry;
+ new_backing_dentry = new_backing_path.dentry;
+ old_backing_dir_dentry = dget_parent(old_backing_dentry);
+ new_backing_dir_dentry = dget_parent(new_backing_dentry);
+ target_inode = d_inode(newent);
+
+ trap = lock_rename(old_backing_dir_dentry, new_backing_dir_dentry);
+ if (trap == old_backing_dentry) {
+ err = -EINVAL;
+ goto put_parents;
+ }
+ if (trap == new_backing_dentry) {
+ err = -ENOTEMPTY;
+ goto put_parents;
+ }
+
+ rd = (struct renamedata) {
+ .old_mnt_idmap = &nop_mnt_idmap,
+ .old_dir = d_inode(old_backing_dir_dentry),
+ .old_dentry = old_backing_dentry,
+ .new_mnt_idmap = &nop_mnt_idmap,
+ .new_dir = d_inode(new_backing_dir_dentry),
+ .new_dentry = new_backing_dentry,
+ .flags = flags,
+ };
+ err = vfs_rename(&rd);
+ if (err)
+ goto unlock;
+ if (target_inode)
+ fsstack_copy_attr_all(target_inode,
+ get_fuse_inode(target_inode)->backing_inode);
+ fsstack_copy_attr_all(d_inode(oldent), d_inode(old_backing_dentry));
+unlock:
+ unlock_rename(old_backing_dir_dentry, new_backing_dir_dentry);
+put_parents:
+ dput(new_backing_dir_dentry);
+ dput(old_backing_dir_dentry);
+put_new_path:
+ path_put(&new_backing_path);
+put_old_path:
+ path_put(&old_backing_path);
+ return err;
+}
+
+struct fuse_rename2_args {
+ struct fuse_rename2_in in;
+ struct fuse_buffer old_name;
+ struct fuse_buffer new_name;
+};
+
+static int fuse_rename2_initialize_in(struct bpf_fuse_args *fa, struct fuse_rename2_args *args,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent,
+ unsigned int flags)
+{
+ *args = (struct fuse_rename2_args) {
+ .in = (struct fuse_rename2_in) {
+ .newdir = get_node_id(newdir),
+ .flags = flags,
+ },
+ .old_name = (struct fuse_buffer) {
+ .data = (void *) oldent->d_name.name,
+ .size = oldent->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ },
+ .new_name = (struct fuse_buffer) {
+ .data = (void *) newent->d_name.name,
+ .size = newent->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ },
+
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(olddir),
+ .opcode = FUSE_RENAME2,
+ },
+ .in_numargs = 3,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->old_name,
+ },
+ .in_args[2] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->new_name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_rename2_initialize_out(struct bpf_fuse_args *fa, struct fuse_rename2_args *args,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent,
+ unsigned int flags)
+{
+ return 0;
+}
+
+static int fuse_rename2_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent,
+ unsigned int flags)
+{
+ const struct fuse_rename2_in *fri = fa->in_args[0].value;
+
+ /* TODO: deal with changing dirs/ents */
+ *out = fuse_rename_backing_common(olddir, oldent, newdir, newent,
+ fri->flags);
+ return *out;
+}
+
+static int fuse_rename2_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent,
+ unsigned int flags)
+{
+ return 0;
+}
+
+int fuse_bpf_rename2(int *out, struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent,
+ unsigned int flags)
+{
+ return bpf_fuse_backing(olddir, struct fuse_rename2_args, out,
+ fuse_rename2_initialize_in, fuse_rename2_initialize_out,
+ fuse_rename2_backing, fuse_rename2_finalize,
+ olddir, oldent, newdir, newent, flags);
+}
+
+struct fuse_rename_args {
+ struct fuse_rename_in in;
+ struct fuse_buffer old_name;
+ struct fuse_buffer new_name;
+};
+
+static int fuse_rename_initialize_in(struct bpf_fuse_args *fa, struct fuse_rename_args *args,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent)
+{
+ *args = (struct fuse_rename_args) {
+ .in = (struct fuse_rename_in) {
+ .newdir = get_node_id(newdir),
+ },
+ .old_name = (struct fuse_buffer) {
+ .data = (void *) oldent->d_name.name,
+ .size = oldent->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ },
+ .new_name = (struct fuse_buffer) {
+ .data = (void *) newent->d_name.name,
+ .size = newent->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ },
+
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(olddir),
+ .opcode = FUSE_RENAME,
+ },
+ .in_numargs = 3,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->old_name,
+ },
+ .in_args[2] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->new_name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_rename_initialize_out(struct bpf_fuse_args *fa, struct fuse_rename_args *args,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent)
+{
+ return 0;
+}
+
+static int fuse_rename_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent)
+{
+ /* TODO: deal with changing dirs/ents */
+ *out = fuse_rename_backing_common(olddir, oldent, newdir, newent, 0);
+ return *out;
+}
+
+static int fuse_rename_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent)
+{
+ return 0;
+}
+
+int fuse_bpf_rename(int *out, struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent)
+{
+ return bpf_fuse_backing(olddir, struct fuse_rename_args, out,
+ fuse_rename_initialize_in, fuse_rename_initialize_out,
+ fuse_rename_backing, fuse_rename_finalize,
+ olddir, oldent, newdir, newent);
+}
+
static int fuse_unlink_initialize_in(struct bpf_fuse_args *fa, struct fuse_buffer *name,
struct inode *dir, struct dentry *entry)
{
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 5ce65f696980..086e3ecada19 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1184,6 +1184,10 @@ static int fuse_rename2(struct mnt_idmap *idmap, struct inode *olddir,
return -EINVAL;

if (flags) {
+ if (fuse_bpf_rename2(&err, olddir, oldent, newdir, newent, flags))
+ return err;
+
+ /* TODO: how should this go with bpfs involved? */
if (fc->no_rename2 || fc->minor < 23)
return -EINVAL;

@@ -1195,6 +1199,9 @@ static int fuse_rename2(struct mnt_idmap *idmap, struct inode *olddir,
err = -EINVAL;
}
} else {
+ if (fuse_bpf_rename(&err, olddir, oldent, newdir, newent))
+ return err;
+
err = fuse_rename_common(olddir, oldent, newdir, newent, 0,
FUSE_RENAME,
sizeof(struct fuse_rename_in));
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e60207bf66de..5c8bd2f76fb9 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1411,6 +1411,11 @@ int fuse_bpf_create_open(int *out, struct inode *dir, struct dentry *entry,
int fuse_bpf_mknod(int *out, struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev);
int fuse_bpf_mkdir(int *out, struct inode *dir, struct dentry *entry, umode_t mode);
int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *entry);
+int fuse_bpf_rename2(int *out, struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent,
+ unsigned int flags);
+int fuse_bpf_rename(int *out, struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent);
int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry);
int fuse_bpf_release(int *out, struct inode *inode, struct file *file);
int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file);
@@ -1453,6 +1458,19 @@ static inline int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *ent
return 0;
}

+static inline int fuse_bpf_rename2(int *out, struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent,
+ unsigned int flags)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_rename(int *out, struct inode *olddir, struct dentry *oldent,
+ struct inode *newdir, struct dentry *newent)
+{
+ return 0;
+}
+
static inline int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:44:09

by Daniel Rosenberg

Subject: [RFC PATCH v3 17/37] fuse-bpf: Add support for read/write iter

Add backing support for FUSE_READ and FUSE_WRITE.

This includes adjustments from Amir Goldstein's FUSE passthrough
patch.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 371 ++++++++++++++++++++++++++++++++++++++
fs/fuse/control.c | 2 +-
fs/fuse/file.c | 8 +
fs/fuse/fuse_i.h | 19 +-
fs/fuse/inode.c | 13 ++
include/uapi/linux/fuse.h | 10 +
6 files changed, 421 insertions(+), 2 deletions(-)
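
The async paths in this patch set the request refcount to 2, drop one
reference right after submission, and drop the other from the completion
handler (or inline when the backing call does not return -EIOCBQUEUED).
Below is a minimal userspace sketch of that lifetime rule, not the kernel
code: struct demo_aio_req, demo_aio_put() and demo_aio_submit() are
hypothetical names, and C11 atomics stand in for refcount_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct fuse_bpf_aio_req. */
struct demo_aio_req {
	atomic_int ref;
	int *live;	/* count of requests not yet freed, for observation only */
};

/* Drop one reference; free the request when the last one goes away. */
static void demo_aio_put(struct demo_aio_req *req)
{
	/* atomic_fetch_sub() returns the old value, so 1 means "last ref". */
	if (atomic_fetch_sub(&req->ref, 1) == 1) {
		(*req->live)--;
		free(req);
	}
}

/*
 * Submit a request. When 'queued' is true, the caller must call
 * demo_aio_put() later from its completion path. Otherwise completion
 * is handled inline, like the "*out != -EIOCBQUEUED" branch.
 * Returns the still-live request when queued, NULL otherwise.
 */
static struct demo_aio_req *demo_aio_submit(int *live, bool queued)
{
	struct demo_aio_req *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	atomic_init(&req->ref, 2);	/* one for the submitter, one for completion */
	req->live = live;
	(*live)++;

	demo_aio_put(req);		/* submitter is done with the request */
	if (!queued) {
		demo_aio_put(req);	/* inline completion drops the second ref */
		return NULL;
	}
	return req;
}
```

Starting at 2 means the request cannot be freed by a completion that
races with the submitter while the submitter is still inspecting the
return value.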

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index c6ef10aeec15..c7709a880e9c 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -11,6 +11,7 @@
#include <linux/file.h>
#include <linux/fs_stack.h>
#include <linux/namei.h>
+#include <linux/uio.h>

/*
* expression statement to wrap the backing filter logic
@@ -76,6 +77,89 @@
handled; \
})

+#define FUSE_BPF_IOCB_MASK (IOCB_APPEND | IOCB_DSYNC | IOCB_HIPRI | IOCB_NOWAIT | IOCB_SYNC)
+
+struct fuse_bpf_aio_req {
+ struct kiocb iocb;
+ refcount_t ref;
+ struct kiocb *iocb_orig;
+ struct timespec64 pre_atime;
+};
+
+static struct kmem_cache *fuse_bpf_aio_request_cachep;
+
+static void fuse_file_accessed(struct file *dst_file, struct file *src_file)
+{
+ struct inode *dst_inode;
+ struct inode *src_inode;
+
+ if (dst_file->f_flags & O_NOATIME)
+ return;
+
+ dst_inode = file_inode(dst_file);
+ src_inode = file_inode(src_file);
+
+ if ((!timespec64_equal(&dst_inode->i_mtime, &src_inode->i_mtime) ||
+ !timespec64_equal(&dst_inode->i_ctime, &src_inode->i_ctime))) {
+ dst_inode->i_mtime = src_inode->i_mtime;
+ dst_inode->i_ctime = src_inode->i_ctime;
+ }
+
+ touch_atime(&dst_file->f_path);
+}
+
+static void fuse_copyattr(struct file *dst_file, struct file *src_file)
+{
+ struct inode *dst = file_inode(dst_file);
+ struct inode *src = file_inode(src_file);
+
+ dst->i_atime = src->i_atime;
+ dst->i_mtime = src->i_mtime;
+ dst->i_ctime = src->i_ctime;
+ i_size_write(dst, i_size_read(src));
+ fuse_invalidate_attr(dst);
+}
+
+static void fuse_file_start_write(struct file *fuse_file, struct file *backing_file,
+ loff_t pos, size_t count)
+{
+ struct inode *inode = file_inode(fuse_file);
+ struct fuse_inode *fi = get_fuse_inode(inode);
+
+ if (inode->i_size < pos + count)
+ set_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
+
+ file_start_write(backing_file);
+}
+
+static void fuse_file_end_write(struct file *fuse_file, struct file *backing_file,
+ loff_t pos, size_t res)
+{
+ struct inode *inode = file_inode(fuse_file);
+ struct fuse_inode *fi = get_fuse_inode(inode);
+
+ file_end_write(backing_file);
+
+ if (res > 0)
+ fuse_write_update_attr(inode, pos, res);
+
+ clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
+ fuse_invalidate_attr(inode);
+}
+
+static void fuse_file_start_read(struct file *backing_file, struct timespec64 *pre_atime)
+{
+ *pre_atime = file_inode(backing_file)->i_atime;
+}
+
+static void fuse_file_end_read(struct file *fuse_file, struct file *backing_file,
+ struct timespec64 *pre_atime)
+{
+ /* Mimic atime update policy of passthrough inode, not the value */
+ if (!timespec64_equal(&file_inode(backing_file)->i_atime, pre_atime))
+ fuse_invalidate_atime(file_inode(fuse_file));
+}
+
static void fuse_get_backing_path(struct file *file, struct path *path)
{
path_get(&file->f_path);
@@ -664,6 +748,277 @@ int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t o
file, offset, whence);
}

+static inline void fuse_bpf_aio_put(struct fuse_bpf_aio_req *aio_req)
+{
+ if (refcount_dec_and_test(&aio_req->ref))
+ kmem_cache_free(fuse_bpf_aio_request_cachep, aio_req);
+}
+
+static void fuse_bpf_aio_cleanup_handler(struct fuse_bpf_aio_req *aio_req, long res)
+{
+ struct kiocb *iocb = &aio_req->iocb;
+ struct kiocb *iocb_orig = aio_req->iocb_orig;
+ struct file *filp = iocb->ki_filp;
+ struct file *fuse_filp = iocb_orig->ki_filp;
+
+ if (iocb->ki_flags & IOCB_WRITE) {
+ __sb_writers_acquired(file_inode(iocb->ki_filp)->i_sb,
+ SB_FREEZE_WRITE);
+ fuse_file_end_write(iocb_orig->ki_filp, iocb->ki_filp, iocb->ki_pos, res);
+ } else {
+ fuse_file_end_read(fuse_filp, filp, &aio_req->pre_atime);
+ }
+ iocb_orig->ki_pos = iocb->ki_pos;
+ fuse_bpf_aio_put(aio_req);
+}
+
+static void fuse_bpf_aio_rw_complete(struct kiocb *iocb, long res)
+{
+ struct fuse_bpf_aio_req *aio_req =
+ container_of(iocb, struct fuse_bpf_aio_req, iocb);
+ struct kiocb *iocb_orig = aio_req->iocb_orig;
+
+ fuse_bpf_aio_cleanup_handler(aio_req, res);
+ iocb_orig->ki_complete(iocb_orig, res);
+}
+
+struct fuse_file_read_iter_args {
+ struct fuse_read_in in;
+ struct fuse_read_iter_out out;
+};
+
+static int fuse_file_read_iter_initialize_in(struct bpf_fuse_args *fa, struct fuse_file_read_iter_args *args,
+ struct kiocb *iocb, struct iov_iter *to)
+{
+ struct file *file = iocb->ki_filp;
+ struct fuse_file *ff = file->private_data;
+
+ args->in = (struct fuse_read_in) {
+ .fh = ff->fh,
+ .offset = iocb->ki_pos,
+ .size = to->count,
+ };
+
+ /* TODO we can't assume 'to' is a kvec */
+ /* TODO we also can't assume the vector has only one component */
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .opcode = FUSE_READ,
+ .nodeid = ff->nodeid,
+ }, .in_numargs = 1,
+ .in_args[0].size = sizeof(args->in),
+ .in_args[0].value = &args->in,
+ /*
+ * TODO Design this properly.
+ * Possible approach: do not pass buf to bpf
+ * If going to userland, do a deep copy
+ * For extra credit, do that to/from the vector, rather than
+ * making an extra copy in the kernel
+ */
+ };
+
+ return 0;
+}
+
+static int fuse_file_read_iter_initialize_out(struct bpf_fuse_args *fa, struct fuse_file_read_iter_args *args,
+ struct kiocb *iocb, struct iov_iter *to)
+{
+ args->out = (struct fuse_read_iter_out) {
+ .ret = args->in.size,
+ };
+
+ fa->out_numargs = 1;
+ fa->out_args[0].size = sizeof(args->out);
+ fa->out_args[0].value = &args->out;
+
+ return 0;
+}
+
+static int fuse_file_read_iter_backing(struct bpf_fuse_args *fa, ssize_t *out,
+ struct kiocb *iocb, struct iov_iter *to)
+{
+ struct fuse_read_iter_out *frio = fa->out_args[0].value;
+ struct file *file = iocb->ki_filp;
+ struct fuse_file *ff = file->private_data;
+
+ if (!iov_iter_count(to))
+ return 0;
+
+ if ((iocb->ki_flags & IOCB_DIRECT) &&
+ (!ff->backing_file->f_mapping->a_ops ||
+ !ff->backing_file->f_mapping->a_ops->direct_IO))
+ return -EINVAL;
+
+ /* TODO This just plain ignores any change to fuse_read_in */
+ if (is_sync_kiocb(iocb)) {
+ struct timespec64 pre_atime;
+
+ fuse_file_start_read(ff->backing_file, &pre_atime);
+ *out = vfs_iter_read(ff->backing_file, to, &iocb->ki_pos,
+ iocb_to_rw_flags(iocb->ki_flags, FUSE_BPF_IOCB_MASK));
+ fuse_file_end_read(file, ff->backing_file, &pre_atime);
+ } else {
+ struct fuse_bpf_aio_req *aio_req;
+
+ *out = -ENOMEM;
+ aio_req = kmem_cache_zalloc(fuse_bpf_aio_request_cachep, GFP_KERNEL);
+ if (!aio_req)
+ goto out;
+
+ aio_req->iocb_orig = iocb;
+ fuse_file_start_read(ff->backing_file, &aio_req->pre_atime);
+ kiocb_clone(&aio_req->iocb, iocb, ff->backing_file);
+ aio_req->iocb.ki_complete = fuse_bpf_aio_rw_complete;
+ refcount_set(&aio_req->ref, 2);
+ *out = vfs_iocb_iter_read(ff->backing_file, &aio_req->iocb, to);
+ fuse_bpf_aio_put(aio_req);
+ if (*out != -EIOCBQUEUED)
+ fuse_bpf_aio_cleanup_handler(aio_req, *out);
+ }
+
+ frio->ret = *out;
+
+ /* TODO Need to point value at the buffer for post-modification */
+
+out:
+ fuse_file_accessed(file, ff->backing_file);
+
+ return *out;
+}
+
+static int fuse_file_read_iter_finalize(struct bpf_fuse_args *fa, ssize_t *out,
+ struct kiocb *iocb, struct iov_iter *to)
+{
+ struct fuse_read_iter_out *frio = fa->out_args[0].value;
+
+ *out = frio->ret;
+
+ return 0;
+}
+
+int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to)
+{
+ return bpf_fuse_backing(inode, struct fuse_file_read_iter_args, out,
+ fuse_file_read_iter_initialize_in,
+ fuse_file_read_iter_initialize_out,
+ fuse_file_read_iter_backing,
+ fuse_file_read_iter_finalize,
+ iocb, to);
+}
+
+struct fuse_file_write_iter_args {
+ struct fuse_write_in in;
+ struct fuse_write_iter_out out;
+};
+
+static int fuse_file_write_iter_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_file_write_iter_args *args,
+ struct kiocb *iocb, struct iov_iter *from)
+{
+ struct file *file = iocb->ki_filp;
+ struct fuse_file *ff = file->private_data;
+
+ *args = (struct fuse_file_write_iter_args) {
+ .in.fh = ff->fh,
+ .in.offset = iocb->ki_pos,
+ .in.size = from->count,
+ };
+
+ /* TODO we can't assume 'from' is a kvec */
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .opcode = FUSE_WRITE,
+ .nodeid = ff->nodeid,
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(args->in),
+ .in_args[0].value = &args->in,
+ };
+
+ return 0;
+}
+
+static int fuse_file_write_iter_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_file_write_iter_args *args,
+ struct kiocb *iocb, struct iov_iter *from)
+{
+ /* TODO we can't assume 'from' is a kvec */
+ fa->out_numargs = 1;
+ fa->out_args[0].size = sizeof(args->out);
+ fa->out_args[0].value = &args->out;
+
+ return 0;
+}
+
+static int fuse_file_write_iter_backing(struct bpf_fuse_args *fa, ssize_t *out,
+ struct kiocb *iocb, struct iov_iter *from)
+{
+ struct file *file = iocb->ki_filp;
+ struct fuse_file *ff = file->private_data;
+ struct fuse_write_iter_out *fwio = fa->out_args[0].value;
+ ssize_t count = iov_iter_count(from);
+
+ if (!count)
+ return 0;
+
+ /* TODO This just plain ignores any change to fuse_write_in */
+ /* TODO uint32_t seems smaller than ssize_t.... right? */
+ inode_lock(file_inode(file));
+
+ fuse_copyattr(file, ff->backing_file);
+
+ if (is_sync_kiocb(iocb)) {
+ fuse_file_start_write(file, ff->backing_file, iocb->ki_pos, count);
+ *out = vfs_iter_write(ff->backing_file, from, &iocb->ki_pos,
+ iocb_to_rw_flags(iocb->ki_flags, FUSE_BPF_IOCB_MASK));
+ fuse_file_end_write(file, ff->backing_file, iocb->ki_pos, *out);
+ } else {
+ struct fuse_bpf_aio_req *aio_req;
+
+ *out = -ENOMEM;
+ aio_req = kmem_cache_zalloc(fuse_bpf_aio_request_cachep, GFP_KERNEL);
+ if (!aio_req)
+ goto out;
+
+ fuse_file_start_write(file, ff->backing_file, iocb->ki_pos, count);
+ __sb_writers_release(file_inode(ff->backing_file)->i_sb, SB_FREEZE_WRITE);
+ aio_req->iocb_orig = iocb;
+ kiocb_clone(&aio_req->iocb, iocb, ff->backing_file);
+ aio_req->iocb.ki_complete = fuse_bpf_aio_rw_complete;
+ refcount_set(&aio_req->ref, 2);
+ *out = vfs_iocb_iter_write(ff->backing_file, &aio_req->iocb, from);
+ fuse_bpf_aio_put(aio_req);
+ if (*out != -EIOCBQUEUED)
+ fuse_bpf_aio_cleanup_handler(aio_req, *out);
+ }
+
+out:
+ inode_unlock(file_inode(file));
+ fwio->ret = *out;
+ if (*out < 0)
+ return *out;
+ return 0;
+}
+
+static int fuse_file_write_iter_finalize(struct bpf_fuse_args *fa, ssize_t *out,
+ struct kiocb *iocb, struct iov_iter *from)
+{
+ struct fuse_write_iter_out *fwio = fa->out_args[0].value;
+
+ *out = fwio->ret;
+ return 0;
+}
+
+int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *from)
+{
+ return bpf_fuse_backing(inode, struct fuse_file_write_iter_args, out,
+ fuse_file_write_iter_initialize_in,
+ fuse_file_write_iter_initialize_out,
+ fuse_file_write_iter_backing,
+ fuse_file_write_iter_finalize,
+ iocb, from);
+}
+
ssize_t fuse_backing_mmap(struct file *file, struct vm_area_struct *vma)
{
int ret;
@@ -1360,3 +1715,19 @@ int fuse_bpf_access(int *out, struct inode *inode, int mask)
fuse_access_initialize_in, fuse_access_initialize_out,
fuse_access_backing, fuse_access_finalize, inode, mask);
}
+
+int __init fuse_bpf_init(void)
+{
+ fuse_bpf_aio_request_cachep = kmem_cache_create("fuse_bpf_aio_req",
+ sizeof(struct fuse_bpf_aio_req),
+ 0, SLAB_HWCACHE_ALIGN, NULL);
+ if (!fuse_bpf_aio_request_cachep)
+ return -ENOMEM;
+
+ return 0;
+}
+
+void __exit fuse_bpf_cleanup(void)
+{
+ kmem_cache_destroy(fuse_bpf_aio_request_cachep);
+}
diff --git a/fs/fuse/control.c b/fs/fuse/control.c
index 247ef4f76761..685552453751 100644
--- a/fs/fuse/control.c
+++ b/fs/fuse/control.c
@@ -378,7 +378,7 @@ int __init fuse_ctl_init(void)
return register_filesystem(&fuse_ctl_fs_type);
}

-void __exit fuse_ctl_cleanup(void)
+void fuse_ctl_cleanup(void)
{
unregister_filesystem(&fuse_ctl_fs_type);
}
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 1836d09d9ce3..5f19ef5bf124 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1679,6 +1679,7 @@ static ssize_t fuse_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
struct file *file = iocb->ki_filp;
struct fuse_file *ff = file->private_data;
struct inode *inode = file_inode(file);
+ ssize_t ret;

if (fuse_is_bad(inode))
return -EIO;
@@ -1686,6 +1687,9 @@ static ssize_t fuse_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
if (FUSE_IS_DAX(inode))
return fuse_dax_read_iter(iocb, to);

+ if (fuse_bpf_file_read_iter(&ret, inode, iocb, to))
+ return ret;
+
if (!(ff->open_flags & FOPEN_DIRECT_IO))
return fuse_cache_read_iter(iocb, to);
else
@@ -1697,6 +1701,7 @@ static ssize_t fuse_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
struct file *file = iocb->ki_filp;
struct fuse_file *ff = file->private_data;
struct inode *inode = file_inode(file);
+ ssize_t ret = 0;

if (fuse_is_bad(inode))
return -EIO;
@@ -1704,6 +1709,9 @@ static ssize_t fuse_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
if (FUSE_IS_DAX(inode))
return fuse_dax_write_iter(iocb, from);

+ if (fuse_bpf_file_write_iter(&ret, inode, iocb, from))
+ return ret;
+
if (!(ff->open_flags & FOPEN_DIRECT_IO))
return fuse_cache_write_iter(iocb, from);
else
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 2cbe232c1048..4bc070b81ac2 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1135,7 +1135,7 @@ int fuse_dev_init(void);
void fuse_dev_cleanup(void);

int fuse_ctl_init(void);
-void __exit fuse_ctl_cleanup(void);
+void fuse_ctl_cleanup(void);

/**
* Simple request sending that does request allocation and freeing
@@ -1415,6 +1415,8 @@ int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry);
int fuse_bpf_release(int *out, struct inode *inode, struct file *file);
int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file);
int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence);
+int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to);
+int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *from);
int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length);
int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
int fuse_bpf_access(int *out, struct inode *inode, int mask);
@@ -1467,6 +1469,16 @@ static inline int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *
return 0;
}

+static inline int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *from)
+{
+ return 0;
+}
+
static inline int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length)
{
return 0;
@@ -1517,4 +1529,9 @@ static inline u64 attr_timeout(struct fuse_attr_out *o)
return time_to_jiffies(o->attr_valid, o->attr_valid_nsec);
}

+#ifdef CONFIG_FUSE_BPF
+int __init fuse_bpf_init(void);
+void __exit fuse_bpf_cleanup(void);
+#endif /* CONFIG_FUSE_BPF */
+
#endif /* _FS_FUSE_I_H */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index fe80984f099a..fd00910f1eb1 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -2095,11 +2095,21 @@ static int __init fuse_init(void)
if (res)
goto err_sysfs_cleanup;

+#ifdef CONFIG_FUSE_BPF
+ res = fuse_bpf_init();
+ if (res)
+ goto err_ctl_cleanup;
+#endif
+
sanitize_global_limit(&max_user_bgreq);
sanitize_global_limit(&max_user_congthresh);

return 0;

+#ifdef CONFIG_FUSE_BPF
+ err_ctl_cleanup:
+ fuse_ctl_cleanup();
+#endif
err_sysfs_cleanup:
fuse_sysfs_cleanup();
err_dev_cleanup:
@@ -2117,6 +2127,9 @@ static void __exit fuse_exit(void)
fuse_ctl_cleanup();
fuse_sysfs_cleanup();
fuse_fs_cleanup();
+#ifdef CONFIG_FUSE_BPF
+ fuse_bpf_cleanup();
+#endif
fuse_dev_cleanup();
}

diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 3ad725a3e968..dbfc8d501bcb 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -748,6 +748,11 @@ struct fuse_read_in {
uint32_t padding;
};

+// This is likely not what we want
+struct fuse_read_iter_out {
+ uint64_t ret;
+};
+
#define FUSE_COMPAT_WRITE_IN_SIZE 24

struct fuse_write_in {
@@ -765,6 +770,11 @@ struct fuse_write_out {
uint32_t padding;
};

+// This is likely not what we want
+struct fuse_write_iter_out {
+ uint64_t ret;
+};
+
#define FUSE_COMPAT_STATFS_SIZE 48

struct fuse_statfs_out {
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:44:20

by Daniel Rosenberg

Subject: [RFC PATCH v3 18/37] fuse-bpf: support readdir

This adds backing support for FUSE_READDIR.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 194 ++++++++++++++++++++++++++++++++++++++
fs/fuse/fuse_i.h | 6 ++
fs/fuse/readdir.c | 5 +
include/uapi/linux/fuse.h | 6 ++
4 files changed, 211 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index c7709a880e9c..2908c231a695 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -1669,6 +1669,200 @@ int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry)
dir, entry);
}

+struct fuse_read_args {
+ struct fuse_read_in in;
+ struct fuse_read_out out;
+ struct fuse_buffer buffer;
+};
+
+static int fuse_readdir_initialize_in(struct bpf_fuse_args *fa, struct fuse_read_args *args,
+ struct file *file, struct dir_context *ctx,
+ bool *force_again, bool *allow_force, bool is_continued)
+{
+ struct fuse_file *ff = file->private_data;
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = ff->nodeid,
+ .opcode = FUSE_READDIR,
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ };
+
+ args->in = (struct fuse_read_in) {
+ .fh = ff->fh,
+ .offset = ctx->pos,
+ .size = PAGE_SIZE,
+ };
+
+ *force_again = false;
+ *allow_force = true;
+ return 0;
+}
+
+static int fuse_readdir_initialize_out(struct bpf_fuse_args *fa, struct fuse_read_args *args,
+ struct file *file, struct dir_context *ctx,
+ bool *force_again, bool *allow_force, bool is_continued)
+{
+ u8 *page = (u8 *)__get_free_page(GFP_KERNEL);
+
+ if (!page)
+ return -ENOMEM;
+
+ fa->flags = FUSE_BPF_OUT_ARGVAR;
+ fa->out_numargs = 2;
+ fa->out_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->out),
+ .value = &args->out,
+ };
+ fa->out_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->buffer,
+ };
+ args->out = (struct fuse_read_out) {
+ .again = 0,
+ .offset = 0,
+ };
+ args->buffer = (struct fuse_buffer) {
+ .data = page,
+ .size = PAGE_SIZE,
+ .alloc_size = PAGE_SIZE,
+ .max_size = PAGE_SIZE,
+ .flags = BPF_FUSE_VARIABLE_SIZE,
+ };
+
+ return 0;
+}
+
+struct fusebpf_ctx {
+ struct dir_context ctx;
+ u8 *addr;
+ size_t offset;
+};
+
+static bool filldir(struct dir_context *ctx, const char *name, int namelen,
+ loff_t offset, u64 ino, unsigned int d_type)
+{
+ struct fusebpf_ctx *ec = container_of(ctx, struct fusebpf_ctx, ctx);
+ struct fuse_dirent *fd = (struct fuse_dirent *)(ec->addr + ec->offset);
+
+ if (ec->offset + sizeof(struct fuse_dirent) + namelen > PAGE_SIZE)
+ return false;
+
+ *fd = (struct fuse_dirent) {
+ .ino = ino,
+ .off = offset,
+ .namelen = namelen,
+ .type = d_type,
+ };
+
+ memcpy(fd->name, name, namelen);
+ ec->offset += FUSE_DIRENT_SIZE(fd);
+
+ return true;
+}
+
+static int parse_dirfile(char *buf, size_t nbytes, struct dir_context *ctx)
+{
+ while (nbytes >= FUSE_NAME_OFFSET) {
+ struct fuse_dirent *dirent = (struct fuse_dirent *) buf;
+ size_t reclen = FUSE_DIRENT_SIZE(dirent);
+
+ if (!dirent->namelen || dirent->namelen > FUSE_NAME_MAX)
+ return -EIO;
+ if (reclen > nbytes)
+ break;
+ if (memchr(dirent->name, '/', dirent->namelen) != NULL)
+ return -EIO;
+
+ ctx->pos = dirent->off;
+ if (!dir_emit(ctx, dirent->name, dirent->namelen, dirent->ino,
+ dirent->type))
+ break;
+
+ buf += reclen;
+ nbytes -= reclen;
+ }
+
+ return 0;
+}
+
+static int fuse_readdir_backing(struct bpf_fuse_args *fa, int *out,
+ struct file *file, struct dir_context *ctx,
+ bool *force_again, bool *allow_force, bool is_continued)
+{
+ struct fuse_file *ff = file->private_data;
+ struct file *backing_dir = ff->backing_file;
+ struct fuse_read_out *fro = fa->out_args[0].value;
+ struct fusebpf_ctx ec;
+
+ ec = (struct fusebpf_ctx) {
+ .ctx.actor = filldir,
+ .ctx.pos = ctx->pos,
+ .addr = fa->out_args[1].buffer->data,
+ };
+
+ if (!ec.addr)
+ return -ENOMEM;
+
+ if (!is_continued)
+ backing_dir->f_pos = file->f_pos;
+
+ *out = iterate_dir(backing_dir, &ec.ctx);
+ if (ec.offset == 0)
+ *allow_force = false;
+ fa->out_args[1].buffer->size = ec.offset;
+
+ fro->offset = ec.ctx.pos;
+ fro->again = false;
+
+ return *out;
+}
+
+static int fuse_readdir_finalize(struct bpf_fuse_args *fa, int *out,
+ struct file *file, struct dir_context *ctx,
+ bool *force_again, bool *allow_force, bool is_continued)
+{
+ struct fuse_read_out *fro = fa->out_args[0].value;
+ struct fuse_file *ff = file->private_data;
+ struct file *backing_dir = ff->backing_file;
+
+ *out = parse_dirfile(fa->out_args[1].buffer->data, fa->out_args[1].buffer->size, ctx);
+ *force_again = !!fro->again;
+ if (*force_again && !*allow_force)
+ *out = -EINVAL;
+
+ ctx->pos = fro->offset;
+ backing_dir->f_pos = fro->offset;
+
+ free_page((unsigned long)fa->out_args[1].buffer->data);
+ return *out;
+}
+
+int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct dir_context *ctx)
+{
+ int ret;
+ bool allow_force;
+ bool force_again = false;
+ bool is_continued = false;
+
+again:
+ ret = bpf_fuse_backing(inode, struct fuse_read_args, out,
+ fuse_readdir_initialize_in, fuse_readdir_initialize_out,
+ fuse_readdir_backing, fuse_readdir_finalize,
+ file, ctx, &force_again, &allow_force, is_continued);
+ if (force_again && *out >= 0) {
+ is_continued = true;
+ goto again;
+ }
+
+ return ret;
+}
+
static int fuse_access_initialize_in(struct bpf_fuse_args *fa, struct fuse_access_in *in,
struct inode *inode, int mask)
{
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 4bc070b81ac2..fb3a77b79b0f 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1419,6 +1419,7 @@ int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *ioc
int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *from);
int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length);
int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
+int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct dir_context *ctx);
int fuse_bpf_access(int *out, struct inode *inode, int mask);

#else
@@ -1489,6 +1490,11 @@ static inline int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct
return 0;
}

+static inline int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct dir_context *ctx)
+{
+ return 0;
+}
+
static inline int fuse_bpf_access(int *out, struct inode *inode, int mask)
{
return 0;
diff --git a/fs/fuse/readdir.c b/fs/fuse/readdir.c
index dc603479b30e..cc6548f314f2 100644
--- a/fs/fuse/readdir.c
+++ b/fs/fuse/readdir.c
@@ -20,6 +20,8 @@ static bool fuse_use_readdirplus(struct inode *dir, struct dir_context *ctx)

if (!fc->do_readdirplus)
return false;
+ if (fi->nodeid == 0)
+ return false;
if (!fc->readdirplus_auto)
return true;
if (test_and_clear_bit(FUSE_I_ADVISE_RDPLUS, &fi->state))
@@ -582,6 +584,9 @@ int fuse_readdir(struct file *file, struct dir_context *ctx)
if (fuse_is_bad(inode))
return -EIO;

+ if (fuse_bpf_readdir(&err, inode, file, ctx))
+ return err;
+
mutex_lock(&ff->readdir.lock);

err = UNCACHED;
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index dbfc8d501bcb..e779064f5fad 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -748,6 +748,12 @@ struct fuse_read_in {
uint32_t padding;
};

+struct fuse_read_out {
+ uint64_t offset;
+ uint32_t again;
+ uint32_t padding;
+};
+
// This is likely not what we want
struct fuse_read_iter_out {
uint64_t ret;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:44:21

by Daniel Rosenberg

Subject: [RFC PATCH v3 19/37] fuse-bpf: Add support for sync operations

This adds backing support for FUSE_FLUSH, FUSE_FSYNC, and FUSE_FSYNCDIR.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 147 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dir.c | 3 +
fs/fuse/file.c | 7 +++
fs/fuse/fuse_i.h | 18 ++++++
4 files changed, 175 insertions(+)
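
fuse_fsync_backing() derives its datasync argument from the in-args
(fsync_flags) rather than from the original VFS parameter, so a change
made to the in-args before the backing step takes effect. A small
userspace sketch of that encode/decode round trip follows; struct
demo_fsync_in is a reduced, hypothetical stand-in for struct
fuse_fsync_in, and DEMO_FSYNC_FDATASYNC mirrors FUSE_FSYNC_FDATASYNC.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FSYNC_FDATASYNC (1u << 0)	/* mirrors FUSE_FSYNC_FDATASYNC */

struct demo_fsync_in {	/* hypothetical, reduced stand-in for struct fuse_fsync_in */
	uint64_t fh;
	uint32_t fsync_flags;
};

/* initialize_in step: fold the VFS 'datasync' argument into the flags. */
static struct demo_fsync_in demo_fsync_encode(uint64_t fh, int datasync)
{
	struct demo_fsync_in in = {
		.fh = fh,
		.fsync_flags = datasync ? DEMO_FSYNC_FDATASYNC : 0,
	};

	return in;
}

/* backing step: recover datasync from the (possibly modified) in-args. */
static int demo_fsync_decode(const struct demo_fsync_in *in)
{
	return (in->fsync_flags & DEMO_FSYNC_FDATASYNC) ? 1 : 0;
}
```

Clearing the flag between the two calls turns an fdatasync-style request
into a full fsync on the backing file in this model.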

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 2908c231a695..30492f7b2a05 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -659,6 +659,59 @@ int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file)
fuse_release_backing, fuse_release_finalize, inode, file);
}

+static int fuse_flush_initialize_in(struct bpf_fuse_args *fa, struct fuse_flush_in *ffi,
+ struct file *file, fl_owner_t id)
+{
+ struct fuse_file *fuse_file = file->private_data;
+
+ *ffi = (struct fuse_flush_in) {
+ .fh = fuse_file->fh,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(file->f_inode),
+ .opcode = FUSE_FLUSH,
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(*ffi),
+ .in_args[0].value = ffi,
+ .flags = FUSE_BPF_FORCE,
+ };
+
+ return 0;
+}
+
+static int fuse_flush_initialize_out(struct bpf_fuse_args *fa, struct fuse_flush_in *ffi,
+ struct file *file, fl_owner_t id)
+{
+ return 0;
+}
+
+static int fuse_flush_backing(struct bpf_fuse_args *fa, int *out, struct file *file, fl_owner_t id)
+{
+ struct fuse_file *fuse_file = file->private_data;
+ struct file *backing_file = fuse_file->backing_file;
+
+ *out = 0;
+ if (backing_file->f_op->flush)
+ *out = backing_file->f_op->flush(backing_file, id);
+ return *out;
+}
+
+static int fuse_flush_finalize(struct bpf_fuse_args *fa, int *out, struct file *file, fl_owner_t id)
+{
+ return 0;
+}
+
+int fuse_bpf_flush(int *out, struct inode *inode, struct file *file, fl_owner_t id)
+{
+ return bpf_fuse_backing(inode, struct fuse_flush_in, out,
+ fuse_flush_initialize_in, fuse_flush_initialize_out,
+ fuse_flush_backing, fuse_flush_finalize,
+ file, id);
+}
+
struct fuse_lseek_args {
struct fuse_lseek_in in;
struct fuse_lseek_out out;
@@ -748,6 +801,100 @@ int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t o
file, offset, whence);
}

+static int fuse_fsync_initialize_in(struct bpf_fuse_args *fa, struct fuse_fsync_in *in,
+ struct file *file, loff_t start, loff_t end, int datasync)
+{
+ struct fuse_file *fuse_file = file->private_data;
+
+ *in = (struct fuse_fsync_in) {
+ .fh = fuse_file->fh,
+ .fsync_flags = datasync ? FUSE_FSYNC_FDATASYNC : 0,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(file->f_inode)->nodeid,
+ .opcode = FUSE_FSYNC,
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(*in),
+ .in_args[0].value = in,
+ .flags = FUSE_BPF_FORCE,
+ };
+
+ return 0;
+}
+
+static int fuse_fsync_initialize_out(struct bpf_fuse_args *fa, struct fuse_fsync_in *ffi,
+ struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return 0;
+}
+
+static int fuse_fsync_backing(struct bpf_fuse_args *fa, int *out,
+ struct file *file, loff_t start, loff_t end, int datasync)
+{
+ struct fuse_file *fuse_file = file->private_data;
+ struct file *backing_file = fuse_file->backing_file;
+ const struct fuse_fsync_in *ffi = fa->in_args[0].value;
+ int new_datasync = (ffi->fsync_flags & FUSE_FSYNC_FDATASYNC) ? 1 : 0;
+
+ *out = vfs_fsync(backing_file, new_datasync);
+ return 0;
+}
+
+static int fuse_fsync_finalize(struct bpf_fuse_args *fa, int *out,
+ struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return 0;
+}
+
+int fuse_bpf_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return bpf_fuse_backing(inode, struct fuse_fsync_in, out,
+ fuse_fsync_initialize_in, fuse_fsync_initialize_out,
+ fuse_fsync_backing, fuse_fsync_finalize,
+ file, start, end, datasync);
+}
+
+static int fuse_dir_fsync_initialize_in(struct bpf_fuse_args *fa, struct fuse_fsync_in *in,
+ struct file *file, loff_t start, loff_t end, int datasync)
+{
+ struct fuse_file *fuse_file = file->private_data;
+
+ *in = (struct fuse_fsync_in) {
+ .fh = fuse_file->fh,
+ .fsync_flags = datasync ? FUSE_FSYNC_FDATASYNC : 0,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(file->f_inode)->nodeid,
+ .opcode = FUSE_FSYNCDIR,
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(*in),
+ .in_args[0].value = in,
+ .flags = FUSE_BPF_FORCE,
+ };
+
+ return 0;
+}
+
+static int fuse_dir_fsync_initialize_out(struct bpf_fuse_args *fa, struct fuse_fsync_in *ffi,
+ struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return 0;
+}
+
+int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return bpf_fuse_backing(inode, struct fuse_fsync_in, out,
+ fuse_dir_fsync_initialize_in, fuse_dir_fsync_initialize_out,
+ fuse_fsync_backing, fuse_fsync_finalize,
+ file, start, end, datasync);
+}
+
static inline void fuse_bpf_aio_put(struct fuse_bpf_aio_req *aio_req)
{
if (refcount_dec_and_test(&aio_req->ref))
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index a763a45fa973..5ce65f696980 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1666,6 +1666,9 @@ static int fuse_dir_fsync(struct file *file, loff_t start, loff_t end,
if (fuse_is_bad(inode))
return -EIO;

+ if (fuse_bpf_dir_fsync(&err, inode, file, start, end, datasync))
+ return err;
+
if (fc->no_fsyncdir)
return 0;

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 5f19ef5bf124..a4a0aeb28e4a 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -554,10 +554,14 @@ static int fuse_flush(struct file *file, fl_owner_t id)
struct inode *inode = file_inode(file);
struct fuse_mount *fm = get_fuse_mount(inode);
struct fuse_file *ff = file->private_data;
+ int err;

if (fuse_is_bad(inode))
return -EIO;

+ if (fuse_bpf_flush(&err, file_inode(file), file, id))
+ return err;
+
if (ff->open_flags & FOPEN_NOFLUSH && !fm->fc->writeback_cache)
return 0;

@@ -615,6 +619,9 @@ static int fuse_fsync(struct file *file, loff_t start, loff_t end,
if (fuse_is_bad(inode))
return -EIO;

+ if (fuse_bpf_fsync(&err, inode, file, start, end, datasync))
+ return err;
+
inode_lock(inode);

/*
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index fb3a77b79b0f..e60207bf66de 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1414,7 +1414,10 @@ int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *entry);
int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry);
int fuse_bpf_release(int *out, struct inode *inode, struct file *file);
int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file);
+int fuse_bpf_flush(int *out, struct inode *inode, struct file *file, fl_owner_t id);
int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence);
+int fuse_bpf_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync);
+int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync);
int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to);
int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *from);
int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length);
@@ -1465,11 +1468,26 @@ static inline int fuse_bpf_releasedir(int *out, struct inode *inode, struct file
return 0;
}

+static inline int fuse_bpf_flush(int *out, struct inode *inode, struct file *file, fl_owner_t id)
+{
+ return 0;
+}
+
static inline int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence)
{
return 0;
}

+static inline int fuse_bpf_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync)
+{
+ return 0;
+}
+
static inline int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog
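
The call sites added in this patch all follow one convention: a fuse_bpf_*() hook returns nonzero when the backing path handled the request, with the result stored through its first argument, and returns 0 to let the regular FUSE path continue (the static inline stubs in fuse_i.h always return 0). The following is a minimal userspace sketch of that convention only. Every demo_* name is hypothetical, and it does not model the internals of bpf_fuse_backing(), which are not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an open file that may have a backing file. */
struct demo_file {
	bool has_backing;	/* true if a lower file is attached */
	int backing_result;	/* what the lower filesystem's fsync would return */
};

/*
 * Mirrors the fuse_bpf_*() calling convention seen at the call sites:
 * nonzero means "handled, result is in *out", 0 means "not handled".
 */
int demo_bpf_fsync(int *out, const struct demo_file *f)
{
	assert(out != NULL && f != NULL);
	if (!f->has_backing)
		return 0;
	*out = f->backing_result;
	return 1;
}

/* Placeholder for the regular path that would talk to the daemon. */
int demo_daemon_fsync(const struct demo_file *f)
{
	(void)f;
	return -ENOSYS;
}

/* Same shape as the hunk added to fuse_fsync(): try backing first. */
int demo_fsync(const struct demo_file *f)
{
	int err;

	if (demo_bpf_fsync(&err, f))
		return err;
	return demo_daemon_fsync(f);
}
```

With a backing file attached, demo_fsync() returns whatever the backing call produced, including errors. Without one, it falls through to the placeholder daemon path.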

2023-04-18 01:44:23

by Daniel Rosenberg

Subject: [RFC PATCH v3 25/37] fuse-bpf: allow mounting with no userspace daemon

When fuse-bpf runs in pure passthrough mode, no userspace daemon is
required. Add a no_daemon mount option so that a filesystem can be mounted
without one, which allows simple testing of the backing operations.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/fuse_i.h | 4 ++++
fs/fuse/inode.c | 25 +++++++++++++++++++------
2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 121d31a04e79..2bd45c8658e8 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -566,6 +566,7 @@ struct fuse_fs_context {
bool no_control:1;
bool no_force_umount:1;
bool legacy_opts_show:1;
+ bool no_daemon:1;
enum fuse_dax_mode dax_mode;
unsigned int max_read;
unsigned int blksize;
@@ -847,6 +848,9 @@ struct fuse_conn {
/* Is tmpfile not implemented by fs? */
unsigned int no_tmpfile:1;

+ /** BPF Only, no Daemon running */
+ unsigned int no_daemon:1;
+
/** The number of requests waiting for completion */
atomic_t num_waiting;

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 3dfb9cfb6e73..31f34962bc9b 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -756,6 +756,7 @@ enum {
OPT_MAX_READ,
OPT_BLKSIZE,
OPT_ROOT_DIR,
+ OPT_NO_DAEMON,
OPT_ERR
};

@@ -771,6 +772,7 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
fsparam_u32 ("blksize", OPT_BLKSIZE),
fsparam_string ("subtype", OPT_SUBTYPE),
fsparam_u32 ("root_dir", OPT_ROOT_DIR),
+ fsparam_flag ("no_daemon", OPT_NO_DAEMON),
{}
};

@@ -860,6 +862,11 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
return invalfc(fsc, "Unable to open root directory");
break;

+ case OPT_NO_DAEMON:
+ ctx->no_daemon = true;
+ ctx->fd_present = true;
+ break;
+
default:
return -EINVAL;
}
@@ -1419,7 +1426,7 @@ void fuse_send_init(struct fuse_mount *fm)
ia->args.nocreds = true;
ia->args.end = process_init_reply;

- if (fuse_simple_background(fm, &ia->args, GFP_KERNEL) != 0)
+ if (unlikely(fm->fc->no_daemon) || fuse_simple_background(fm, &ia->args, GFP_KERNEL) != 0)
process_init_reply(fm, &ia->args, -ENOTCONN);
}
EXPORT_SYMBOL_GPL(fuse_send_init);
@@ -1694,6 +1701,7 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
fc->destroy = ctx->destroy;
fc->no_control = ctx->no_control;
fc->no_force_umount = ctx->no_force_umount;
+ fc->no_daemon = ctx->no_daemon;

err = -ENOMEM;
root = fuse_get_root_inode(sb, ctx->rootmode, ctx->root_dir);
@@ -1740,7 +1748,7 @@ static int fuse_fill_super(struct super_block *sb, struct fs_context *fsc)
struct fuse_fs_context *ctx = fsc->fs_private;
int err;

- if (!ctx->file || !ctx->rootmode_present ||
+ if (!!ctx->file == ctx->no_daemon || !ctx->rootmode_present ||
!ctx->user_id_present || !ctx->group_id_present)
return -EINVAL;

@@ -1748,10 +1756,12 @@ static int fuse_fill_super(struct super_block *sb, struct fs_context *fsc)
* Require mount to happen from the same user namespace which
* opened /dev/fuse to prevent potential attacks.
*/
- if ((ctx->file->f_op != &fuse_dev_operations) ||
- (ctx->file->f_cred->user_ns != sb->s_user_ns))
- return -EINVAL;
- ctx->fudptr = &ctx->file->private_data;
+ if (ctx->file) {
+ if ((ctx->file->f_op != &fuse_dev_operations) ||
+ (ctx->file->f_cred->user_ns != sb->s_user_ns))
+ return -EINVAL;
+ ctx->fudptr = &ctx->file->private_data;
+ }

err = fuse_fill_super_common(sb, ctx);
if (err)
@@ -1801,6 +1811,9 @@ static int fuse_get_tree(struct fs_context *fsc)

fsc->s_fs_info = fm;

+ if (ctx->no_daemon)
+ return get_tree_nodev(fsc, fuse_fill_super);
+
if (ctx->fd_present)
ctx->file = fget(ctx->fd);

--
2.40.0.634.g4ca3ef3211-goog
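
The reworked check in fuse_fill_super(), "!!ctx->file == ctx->no_daemon", rejects a mount unless exactly one of the two holds: either a /dev/fuse file was supplied or no_daemon was requested. A small userspace sketch of that truth table follows. The demo_* names are hypothetical stand-ins, not kernel types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fuse_fs_context fields involved. */
struct demo_ctx {
	const void *file;	/* non-NULL when a /dev/fuse fd was resolved */
	bool no_daemon;		/* set by the no_daemon mount option */
};

/*
 * Returns true when the combination would be rejected with -EINVAL:
 * both present or both absent. Equivalent to !(file XOR no_daemon).
 */
bool demo_rejects(const struct demo_ctx *ctx)
{
	assert(ctx != NULL);
	return !!ctx->file == ctx->no_daemon;
}
```

A context with a file and no_daemon unset, or with no file and no_daemon set, passes. The other two combinations are rejected.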

2023-04-18 01:44:45

by Daniel Rosenberg

Subject: [RFC PATCH v3 21/37] fuse-bpf: Add attr support

This adds backing support for FUSE_GETATTR, FUSE_SETATTR, and FUSE_STATFS.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 288 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dir.c | 68 ++---------
fs/fuse/fuse_i.h | 102 ++++++++++++++++
fs/fuse/inode.c | 17 +--
4 files changed, 405 insertions(+), 70 deletions(-)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index d3a706b55905..6a6130a16d2b 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -2066,6 +2066,294 @@ int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry)
dir, entry);
}

+struct fuse_getattr_args {
+ struct fuse_getattr_in in;
+ struct fuse_attr_out out;
+};
+
+static int fuse_getattr_initialize_in(struct bpf_fuse_args *fa, struct fuse_getattr_args *args,
+ const struct dentry *entry, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ args->in = (struct fuse_getattr_in) {
+ .getattr_flags = flags,
+ .fh = -1, /* TODO is this OK? */
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(entry->d_inode),
+ .opcode = FUSE_GETATTR,
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_getattr_initialize_out(struct bpf_fuse_args *fa, struct fuse_getattr_args *args,
+ const struct dentry *entry, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ args->out = (struct fuse_attr_out) { 0 };
+
+ fa->out_numargs = 1;
+ fa->out_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->out),
+ .value = &args->out,
+ };
+
+ return 0;
+}
+
+static int fuse_getattr_backing(struct bpf_fuse_args *fa, int *out,
+ const struct dentry *entry, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ struct path *backing_path = &get_fuse_dentry(entry)->backing_path;
+ struct inode *backing_inode = backing_path->dentry->d_inode;
+ struct fuse_attr_out *fao = fa->out_args[0].value;
+ struct kstat tmp;
+
+ if (!stat)
+ stat = &tmp;
+
+ *out = vfs_getattr(backing_path, stat, request_mask, flags);
+
+ if (!*out)
+ fuse_stat_to_attr(get_fuse_conn(entry->d_inode), backing_inode,
+ stat, &fao->attr);
+
+ return 0;
+}
+
+static int finalize_attr(struct inode *inode, struct fuse_attr_out *outarg,
+ u64 attr_version, struct kstat *stat)
+{
+ int err = 0;
+
+ if (fuse_invalid_attr(&outarg->attr) ||
+ ((inode->i_mode ^ outarg->attr.mode) & S_IFMT)) {
+ fuse_make_bad(inode);
+ err = -EIO;
+ } else {
+ fuse_change_attributes(inode, &outarg->attr,
+ attr_timeout(outarg),
+ attr_version);
+ if (stat)
+ fuse_fillattr(inode, &outarg->attr, stat);
+ }
+ return err;
+}
+
+static int fuse_getattr_finalize(struct bpf_fuse_args *fa, int *out,
+ const struct dentry *entry, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ struct fuse_attr_out *outarg = fa->out_args[0].value;
+ struct inode *inode = entry->d_inode;
+ u64 attr_version = fuse_get_attr_version(get_fuse_mount(inode)->fc);
+
+ /* TODO: Ensure this doesn't happen if we had an error getting attrs in
+ * backing.
+ */
+ *out = finalize_attr(inode, outarg, attr_version, stat);
+ return 0;
+}
+
+int fuse_bpf_getattr(int *out, struct inode *inode, const struct dentry *entry, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ return bpf_fuse_backing(inode, struct fuse_getattr_args, out,
+ fuse_getattr_initialize_in, fuse_getattr_initialize_out,
+ fuse_getattr_backing, fuse_getattr_finalize,
+ entry, stat, request_mask, flags);
+}
+
+static void fattr_to_iattr(struct fuse_conn *fc,
+ const struct fuse_setattr_in *arg,
+ struct iattr *iattr)
+{
+ unsigned int fvalid = arg->valid;
+
+ if (fvalid & FATTR_MODE)
+ iattr->ia_valid |= ATTR_MODE, iattr->ia_mode = arg->mode;
+ if (fvalid & FATTR_UID) {
+ iattr->ia_valid |= ATTR_UID;
+ iattr->ia_uid = make_kuid(fc->user_ns, arg->uid);
+ }
+ if (fvalid & FATTR_GID) {
+ iattr->ia_valid |= ATTR_GID;
+ iattr->ia_gid = make_kgid(fc->user_ns, arg->gid);
+ }
+ if (fvalid & FATTR_SIZE)
+ iattr->ia_valid |= ATTR_SIZE, iattr->ia_size = arg->size;
+ if (fvalid & FATTR_ATIME) {
+ iattr->ia_valid |= ATTR_ATIME;
+ iattr->ia_atime.tv_sec = arg->atime;
+ iattr->ia_atime.tv_nsec = arg->atimensec;
+ if (!(fvalid & FATTR_ATIME_NOW))
+ iattr->ia_valid |= ATTR_ATIME_SET;
+ }
+ if (fvalid & FATTR_MTIME) {
+ iattr->ia_valid |= ATTR_MTIME;
+ iattr->ia_mtime.tv_sec = arg->mtime;
+ iattr->ia_mtime.tv_nsec = arg->mtimensec;
+ if (!(fvalid & FATTR_MTIME_NOW))
+ iattr->ia_valid |= ATTR_MTIME_SET;
+ }
+ if (fvalid & FATTR_CTIME) {
+ iattr->ia_valid |= ATTR_CTIME;
+ iattr->ia_ctime.tv_sec = arg->ctime;
+ iattr->ia_ctime.tv_nsec = arg->ctimensec;
+ }
+}
+
+struct fuse_setattr_args {
+ struct fuse_setattr_in in;
+ struct fuse_attr_out out;
+};
+
+static int fuse_setattr_initialize_in(struct bpf_fuse_args *fa, struct fuse_setattr_args *args,
+ struct dentry *dentry, struct iattr *attr, struct file *file)
+{
+ struct fuse_conn *fc = get_fuse_conn(dentry->d_inode);
+
+ *args = (struct fuse_setattr_args) { 0 };
+ iattr_to_fattr(fc, attr, &args->in, true);
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .opcode = FUSE_SETATTR,
+ .nodeid = get_node_id(dentry->d_inode),
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(args->in),
+ .in_args[0].value = &args->in,
+ };
+
+ return 0;
+}
+
+static int fuse_setattr_initialize_out(struct bpf_fuse_args *fa, struct fuse_setattr_args *args,
+ struct dentry *dentry, struct iattr *attr, struct file *file)
+{
+ fa->out_numargs = 1;
+ fa->out_args[0].size = sizeof(args->out);
+ fa->out_args[0].value = &args->out;
+
+ return 0;
+}
+
+static int fuse_setattr_backing(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, struct iattr *attr, struct file *file)
+{
+ struct fuse_conn *fc = get_fuse_conn(dentry->d_inode);
+ const struct fuse_setattr_in *fsi = fa->in_args[0].value;
+ struct iattr new_attr = { 0 };
+ struct path *backing_path = &get_fuse_dentry(dentry)->backing_path;
+
+ fattr_to_iattr(fc, fsi, &new_attr);
+ /* TODO: Some info doesn't get saved by the attr->fattr->attr transition.
+ * When we actually allow the bpf to change these, we may have to consider
+ * the extra flags more, or pass more info into the bpf. Until then we can
+ * keep everything except for ATTR_FILE, since we'd need a file on the
+ * lower fs. For what it's worth, neither f2fs nor ext4 make use of that
+ * even if it is present.
+ */
+ new_attr.ia_valid = attr->ia_valid & ~ATTR_FILE;
+ inode_lock(d_inode(backing_path->dentry));
+ *out = notify_change(&nop_mnt_idmap, backing_path->dentry, &new_attr,
+ NULL);
+ inode_unlock(d_inode(backing_path->dentry));
+
+ if (*out == 0 && (new_attr.ia_valid & ATTR_SIZE))
+ i_size_write(dentry->d_inode, new_attr.ia_size);
+ return 0;
+}
+
+static int fuse_setattr_finalize(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, struct iattr *attr, struct file *file)
+{
+ return 0;
+}
+
+int fuse_bpf_setattr(int *out, struct inode *inode, struct dentry *dentry, struct iattr *attr, struct file *file)
+{
+ return bpf_fuse_backing(inode, struct fuse_setattr_args, out,
+ fuse_setattr_initialize_in, fuse_setattr_initialize_out,
+ fuse_setattr_backing, fuse_setattr_finalize,
+ dentry, attr, file);
+}
+
+static int fuse_statfs_initialize_in(struct bpf_fuse_args *fa, struct fuse_statfs_out *out,
+ struct dentry *dentry, struct kstatfs *buf)
+{
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(d_inode(dentry)),
+ .opcode = FUSE_STATFS,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_statfs_initialize_out(struct bpf_fuse_args *fa, struct fuse_statfs_out *out,
+ struct dentry *dentry, struct kstatfs *buf)
+{
+ *out = (struct fuse_statfs_out) { 0 };
+
+ fa->out_numargs = 1;
+ fa->out_args[0].size = sizeof(*out);
+ fa->out_args[0].value = out;
+
+ return 0;
+}
+
+static int fuse_statfs_backing(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, struct kstatfs *buf)
+{
+ struct path backing_path;
+ struct fuse_statfs_out *fso = fa->out_args[0].value;
+
+ *out = 0;
+ get_fuse_backing_path(dentry, &backing_path);
+ if (!backing_path.dentry)
+ return -EBADF;
+ *out = vfs_statfs(&backing_path, buf);
+ path_put(&backing_path);
+ buf->f_type = FUSE_SUPER_MAGIC;
+
+ /* TODO: Provide postfilter opportunity to modify */
+ if (!*out)
+ convert_statfs_to_fuse(&fso->st, buf);
+
+ return 0;
+}
+
+static int fuse_statfs_finalize(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, struct kstatfs *buf)
+{
+ struct fuse_statfs_out *fso = fa->out_args[0].value;
+
+ if (!fa->info.error_in)
+ convert_fuse_statfs(buf, &fso->st);
+ return 0;
+}
+
+int fuse_bpf_statfs(int *out, struct inode *inode, struct dentry *dentry, struct kstatfs *buf)
+{
+ return bpf_fuse_backing(dentry->d_inode, struct fuse_statfs_out, out,
+ fuse_statfs_initialize_in, fuse_statfs_initialize_out,
+ fuse_statfs_backing, fuse_statfs_finalize,
+ dentry, buf);
+}
+
struct fuse_read_args {
struct fuse_read_in in;
struct fuse_read_out out;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 086e3ecada19..7d589241c9b0 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1236,7 +1236,7 @@ static int fuse_link(struct dentry *entry, struct inode *newdir,
return err;
}

-static void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
+void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
struct kstat *stat)
{
unsigned int blkbits;
@@ -1313,6 +1313,7 @@ static int fuse_do_getattr(struct inode *inode, struct kstat *stat,
}

static int fuse_update_get_attr(struct inode *inode, struct file *file,
+ const struct path *path,
struct kstat *stat, u32 request_mask,
unsigned int flags)
{
@@ -1322,6 +1323,9 @@ static int fuse_update_get_attr(struct inode *inode, struct file *file,
u32 inval_mask = READ_ONCE(fi->inval_mask);
u32 cache_mask = fuse_get_cache_mask(inode);

+ if (fuse_bpf_getattr(&err, inode, path->dentry, stat, request_mask, flags))
+ return err;
+
if (flags & AT_STATX_FORCE_SYNC)
sync = true;
else if (flags & AT_STATX_DONT_SYNC)
@@ -1345,7 +1349,7 @@ static int fuse_update_get_attr(struct inode *inode, struct file *file,

int fuse_update_attributes(struct inode *inode, struct file *file, u32 mask)
{
- return fuse_update_get_attr(inode, file, NULL, mask, 0);
+ return fuse_update_get_attr(inode, file, &file->f_path, NULL, mask, 0);
}

int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid,
@@ -1714,58 +1718,6 @@ static long fuse_dir_compat_ioctl(struct file *file, unsigned int cmd,
FUSE_IOCTL_COMPAT | FUSE_IOCTL_DIR);
}

-static inline bool update_mtime(unsigned int ivalid, bool trust_local_mtime)
-{
- /* Always update if mtime is explicitly set */
- if (ivalid & ATTR_MTIME_SET)
- return true;
-
- /* Or if kernel i_mtime is the official one */
- if (trust_local_mtime)
- return true;
-
- /* If it's an open(O_TRUNC) or an ftruncate(), don't update */
- if ((ivalid & ATTR_SIZE) && (ivalid & (ATTR_OPEN | ATTR_FILE)))
- return false;
-
- /* In all other cases update */
- return true;
-}
-
-static void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
- struct fuse_setattr_in *arg, bool trust_local_cmtime)
-{
- unsigned ivalid = iattr->ia_valid;
-
- if (ivalid & ATTR_MODE)
- arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
- if (ivalid & ATTR_UID)
- arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
- if (ivalid & ATTR_GID)
- arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
- if (ivalid & ATTR_SIZE)
- arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
- if (ivalid & ATTR_ATIME) {
- arg->valid |= FATTR_ATIME;
- arg->atime = iattr->ia_atime.tv_sec;
- arg->atimensec = iattr->ia_atime.tv_nsec;
- if (!(ivalid & ATTR_ATIME_SET))
- arg->valid |= FATTR_ATIME_NOW;
- }
- if ((ivalid & ATTR_MTIME) && update_mtime(ivalid, trust_local_cmtime)) {
- arg->valid |= FATTR_MTIME;
- arg->mtime = iattr->ia_mtime.tv_sec;
- arg->mtimensec = iattr->ia_mtime.tv_nsec;
- if (!(ivalid & ATTR_MTIME_SET) && !trust_local_cmtime)
- arg->valid |= FATTR_MTIME_NOW;
- }
- if ((ivalid & ATTR_CTIME) && trust_local_cmtime) {
- arg->valid |= FATTR_CTIME;
- arg->ctime = iattr->ia_ctime.tv_sec;
- arg->ctimensec = iattr->ia_ctime.tv_nsec;
- }
-}
-
/*
* Prevent concurrent writepages on inode
*
@@ -1880,6 +1832,9 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
bool trust_local_cmtime = is_wb;
bool fault_blocked = false;

+ if (fuse_bpf_setattr(&err, inode, dentry, attr, file))
+ return err;
+
if (!fc->default_permissions)
attr->ia_valid |= ATTR_FORCE;

@@ -2059,7 +2014,8 @@ static int fuse_setattr(struct mnt_idmap *idmap, struct dentry *entry,
* ia_mode calculation may have used stale i_mode.
* Refresh and recalculate.
*/
- ret = fuse_do_getattr(inode, NULL, file);
+ if (!fuse_bpf_getattr(&ret, inode, entry, NULL, 0, 0))
+ ret = fuse_do_getattr(inode, NULL, file);
if (ret)
return ret;

@@ -2116,7 +2072,7 @@ static int fuse_getattr(struct mnt_idmap *idmap,
return -EACCES;
}

- return fuse_update_get_attr(inode, NULL, stat, request_mask, flags);
+ return fuse_update_get_attr(inode, NULL, path, stat, request_mask, flags);
}

static const struct inode_operations fuse_dir_inode_operations = {
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 5c8bd2f76fb9..17899a1fe885 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1427,6 +1427,10 @@ int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *ioc
int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *from);
int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length);
int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry, unsigned int flags);
+int fuse_bpf_getattr(int *out, struct inode *inode, const struct dentry *entry, struct kstat *stat,
+ u32 request_mask, unsigned int flags);
+int fuse_bpf_setattr(int *out, struct inode *inode, struct dentry *dentry, struct iattr *attr, struct file *file);
+int fuse_bpf_statfs(int *out, struct inode *inode, struct dentry *dentry, struct kstatfs *buf);
int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct dir_context *ctx);
int fuse_bpf_access(int *out, struct inode *inode, int mask);

@@ -1526,6 +1530,22 @@ static inline int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct
return 0;
}

+static inline int fuse_bpf_getattr(int *out, struct inode *inode, const struct dentry *entry, struct kstat *stat,
+ u32 request_mask, unsigned int flags)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_setattr(int *out, struct inode *inode, struct dentry *dentry, struct iattr *attr, struct file *file)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_statfs(int *out, struct inode *inode, struct dentry *dentry, struct kstatfs *buf)
+{
+ return 0;
+}
+
static inline int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct dir_context *ctx)
{
return 0;
@@ -1571,6 +1591,88 @@ static inline u64 attr_timeout(struct fuse_attr_out *o)
return time_to_jiffies(o->attr_valid, o->attr_valid_nsec);
}

+static inline bool update_mtime(unsigned int ivalid, bool trust_local_mtime)
+{
+ /* Always update if mtime is explicitly set */
+ if (ivalid & ATTR_MTIME_SET)
+ return true;
+
+ /* Or if kernel i_mtime is the official one */
+ if (trust_local_mtime)
+ return true;
+
+ /* If it's an open(O_TRUNC) or an ftruncate(), don't update */
+ if ((ivalid & ATTR_SIZE) && (ivalid & (ATTR_OPEN | ATTR_FILE)))
+ return false;
+
+ /* In all other cases update */
+ return true;
+}
+
+void fuse_fillattr(struct inode *inode, struct fuse_attr *attr,
+ struct kstat *stat);
+
+static inline void iattr_to_fattr(struct fuse_conn *fc, struct iattr *iattr,
+ struct fuse_setattr_in *arg, bool trust_local_cmtime)
+{
+ unsigned int ivalid = iattr->ia_valid;
+
+ if (ivalid & ATTR_MODE)
+ arg->valid |= FATTR_MODE, arg->mode = iattr->ia_mode;
+ if (ivalid & ATTR_UID)
+ arg->valid |= FATTR_UID, arg->uid = from_kuid(fc->user_ns, iattr->ia_uid);
+ if (ivalid & ATTR_GID)
+ arg->valid |= FATTR_GID, arg->gid = from_kgid(fc->user_ns, iattr->ia_gid);
+ if (ivalid & ATTR_SIZE)
+ arg->valid |= FATTR_SIZE, arg->size = iattr->ia_size;
+ if (ivalid & ATTR_ATIME) {
+ arg->valid |= FATTR_ATIME;
+ arg->atime = iattr->ia_atime.tv_sec;
+ arg->atimensec = iattr->ia_atime.tv_nsec;
+ if (!(ivalid & ATTR_ATIME_SET))
+ arg->valid |= FATTR_ATIME_NOW;
+ }
+ if ((ivalid & ATTR_MTIME) && update_mtime(ivalid, trust_local_cmtime)) {
+ arg->valid |= FATTR_MTIME;
+ arg->mtime = iattr->ia_mtime.tv_sec;
+ arg->mtimensec = iattr->ia_mtime.tv_nsec;
+ if (!(ivalid & ATTR_MTIME_SET) && !trust_local_cmtime)
+ arg->valid |= FATTR_MTIME_NOW;
+ }
+ if ((ivalid & ATTR_CTIME) && trust_local_cmtime) {
+ arg->valid |= FATTR_CTIME;
+ arg->ctime = iattr->ia_ctime.tv_sec;
+ arg->ctimensec = iattr->ia_ctime.tv_nsec;
+ }
+}
+
+static inline void convert_statfs_to_fuse(struct fuse_kstatfs *attr, struct kstatfs *stbuf)
+{
+ attr->bsize = stbuf->f_bsize;
+ attr->frsize = stbuf->f_frsize;
+ attr->blocks = stbuf->f_blocks;
+ attr->bfree = stbuf->f_bfree;
+ attr->bavail = stbuf->f_bavail;
+ attr->files = stbuf->f_files;
+ attr->ffree = stbuf->f_ffree;
+ attr->namelen = stbuf->f_namelen;
+ /* fsid is left zero */
+}
+
+static inline void convert_fuse_statfs(struct kstatfs *stbuf, struct fuse_kstatfs *attr)
+{
+ stbuf->f_type = FUSE_SUPER_MAGIC;
+ stbuf->f_bsize = attr->bsize;
+ stbuf->f_frsize = attr->frsize;
+ stbuf->f_blocks = attr->blocks;
+ stbuf->f_bfree = attr->bfree;
+ stbuf->f_bavail = attr->bavail;
+ stbuf->f_files = attr->files;
+ stbuf->f_ffree = attr->ffree;
+ stbuf->f_namelen = attr->namelen;
+ /* fsid is left zero */
+}
+
#ifdef CONFIG_FUSE_BPF
int __init fuse_bpf_init(void);
void __exit fuse_bpf_cleanup(void);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index fd00910f1eb1..3dfb9cfb6e73 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -623,20 +623,6 @@ static void fuse_send_destroy(struct fuse_mount *fm)
}
}

-static void convert_fuse_statfs(struct kstatfs *stbuf, struct fuse_kstatfs *attr)
-{
- stbuf->f_type = FUSE_SUPER_MAGIC;
- stbuf->f_bsize = attr->bsize;
- stbuf->f_frsize = attr->frsize;
- stbuf->f_blocks = attr->blocks;
- stbuf->f_bfree = attr->bfree;
- stbuf->f_bavail = attr->bavail;
- stbuf->f_files = attr->files;
- stbuf->f_ffree = attr->ffree;
- stbuf->f_namelen = attr->namelen;
- /* fsid is left zero */
-}
-
static int fuse_statfs(struct dentry *dentry, struct kstatfs *buf)
{
struct super_block *sb = dentry->d_sb;
@@ -650,6 +636,9 @@ static int fuse_statfs(struct dentry *dentry, struct kstatfs *buf)
return 0;
}

+ if (fuse_bpf_statfs(&err, dentry->d_inode, dentry, buf))
+ return err;
+
memset(&outarg, 0, sizeof(outarg));
args.in_numargs = 0;
args.opcode = FUSE_STATFS;
--
2.40.0.634.g4ca3ef3211-goog
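
The TODO in fuse_setattr_backing() notes that the attr->fattr->attr round trip loses information, which is why ia_valid is restored from the caller's iattr with only ATTR_FILE masked out. The sketch below illustrates that idea in userspace. The DEMO_* bit values are made-up stand-ins and do not match the kernel's ATTR_* or FATTR_* constants, and only two of the convertible flags are modelled.

```c
#include <assert.h>

/* Stand-in "iattr" flags (values are illustrative, not the kernel's). */
#define DEMO_ATTR_MODE	(1u << 0)
#define DEMO_ATTR_SIZE	(1u << 1)
#define DEMO_ATTR_OPEN	(1u << 2)	/* has no wire-format equivalent here */
#define DEMO_ATTR_FILE	(1u << 3)	/* has no wire-format equivalent here */

/* Stand-in "fattr" flags: only what the wire format can express. */
#define DEMO_FATTR_MODE	(1u << 0)
#define DEMO_FATTR_SIZE	(1u << 1)

/* iattr -> fattr: flags without a counterpart are silently dropped. */
unsigned int demo_to_fattr(unsigned int ivalid)
{
	unsigned int fvalid = 0;

	if (ivalid & DEMO_ATTR_MODE)
		fvalid |= DEMO_FATTR_MODE;
	if (ivalid & DEMO_ATTR_SIZE)
		fvalid |= DEMO_FATTR_SIZE;
	return fvalid;
}

/* fattr -> iattr: can only recover what survived the first step. */
unsigned int demo_to_iattr(unsigned int fvalid)
{
	unsigned int ivalid = 0;

	if (fvalid & DEMO_FATTR_MODE)
		ivalid |= DEMO_ATTR_MODE;
	if (fvalid & DEMO_FATTR_SIZE)
		ivalid |= DEMO_ATTR_SIZE;
	return ivalid;
}

/* What the backing call does instead: keep everything except FILE. */
unsigned int demo_backing_valid(unsigned int ivalid)
{
	return ivalid & ~DEMO_ATTR_FILE;
}

/* Usage: the round trip loses OPEN, the masked original keeps it. */
void demo_selfcheck(void)
{
	unsigned int orig = DEMO_ATTR_SIZE | DEMO_ATTR_OPEN | DEMO_ATTR_FILE;

	assert(demo_to_iattr(demo_to_fattr(orig)) == DEMO_ATTR_SIZE);
	assert(demo_backing_valid(orig) == (DEMO_ATTR_SIZE | DEMO_ATTR_OPEN));
}
```

This also shows the caveat in the TODO: once a BPF program is allowed to change the validity flags, reusing the caller's ia_valid would override that change.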

2023-04-18 01:44:45

by Daniel Rosenberg

Subject: [RFC PATCH v3 22/37] fuse-bpf: Add support for FUSE_COPY_FILE_RANGE

This adds backing support for FUSE_COPY_FILE_RANGE.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
fs/fuse/backing.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/file.c | 4 +++
fs/fuse/fuse_i.h | 10 ++++++
3 files changed, 101 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 6a6130a16d2b..928b24db2303 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -801,6 +801,93 @@ int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t o
file, offset, whence);
}

+struct fuse_copy_file_range_args {
+ struct fuse_copy_file_range_in in;
+ struct fuse_write_out out;
+};
+
+static int fuse_copy_file_range_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_copy_file_range_args *args,
+ struct file *file_in, loff_t pos_in, struct file *file_out,
+ loff_t pos_out, size_t len, unsigned int flags)
+{
+ struct fuse_file *fuse_file_in = file_in->private_data;
+ struct fuse_file *fuse_file_out = file_out->private_data;
+
+ args->in = (struct fuse_copy_file_range_in) {
+ .fh_in = fuse_file_in->fh,
+ .off_in = pos_in,
+ .nodeid_out = fuse_file_out->nodeid,
+ .fh_out = fuse_file_out->fh,
+ .off_out = pos_out,
+ .len = len,
+ .flags = flags,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(file_in->f_inode),
+ .opcode = FUSE_COPY_FILE_RANGE,
+ },
+ .in_numargs = 1,
+ .in_args[0].size = sizeof(args->in),
+ .in_args[0].value = &args->in,
+ };
+
+ return 0;
+}
+
+static int fuse_copy_file_range_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_copy_file_range_args *args,
+ struct file *file_in, loff_t pos_in, struct file *file_out,
+ loff_t pos_out, size_t len, unsigned int flags)
+{
+ fa->out_numargs = 1;
+ fa->out_args[0].size = sizeof(args->out);
+ fa->out_args[0].value = &args->out;
+
+ return 0;
+}
+
+static int fuse_copy_file_range_backing(struct bpf_fuse_args *fa, ssize_t *out, struct file *file_in,
+ loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len,
+ unsigned int flags)
+{
+ const struct fuse_copy_file_range_in *fci = fa->in_args[0].value;
+ struct fuse_file *fuse_file_in = file_in->private_data;
+ struct file *backing_file_in = fuse_file_in->backing_file;
+ struct fuse_file *fuse_file_out = file_out->private_data;
+ struct file *backing_file_out = fuse_file_out->backing_file;
+
+ /* TODO: Handle changing of in/out files */
+ if (backing_file_out)
+ *out = vfs_copy_file_range(backing_file_in, fci->off_in, backing_file_out,
+ fci->off_out, fci->len, fci->flags);
+ else
+ *out = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
+ flags);
+ return 0;
+}
+
+static int fuse_copy_file_range_finalize(struct bpf_fuse_args *fa, ssize_t *out, struct file *file_in,
+ loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len,
+ unsigned int flags)
+{
+ return 0;
+}
+
+int fuse_bpf_copy_file_range(ssize_t *out, struct inode *inode, struct file *file_in,
+ loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len,
+ unsigned int flags)
+{
+ return bpf_fuse_backing(inode, struct fuse_copy_file_range_args, out,
+ fuse_copy_file_range_initialize_in,
+ fuse_copy_file_range_initialize_out,
+ fuse_copy_file_range_backing,
+ fuse_copy_file_range_finalize,
+ file_in, pos_in, file_out, pos_out, len, flags);
+}
+
static int fuse_fsync_initialize_in(struct bpf_fuse_args *fa, struct fuse_fsync_in *in,
struct file *file, loff_t start, loff_t end, int datasync)
{
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index a4a0aeb28e4a..8179afe28c6f 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3199,6 +3199,10 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in,
bool is_unstable = (!fc->writeback_cache) &&
((pos_out + len) > inode_out->i_size);

+ if (fuse_bpf_copy_file_range(&err, file_inode(file_in), file_in, pos_in,
+ file_out, pos_out, len, flags))
+ return err;
+
if (fc->no_copy_file_range)
return -EOPNOTSUPP;

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 17899a1fe885..74540f308636 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1421,6 +1421,9 @@ int fuse_bpf_release(int *out, struct inode *inode, struct file *file);
int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file);
int fuse_bpf_flush(int *out, struct inode *inode, struct file *file, fl_owner_t id);
int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t offset, int whence);
+int fuse_bpf_copy_file_range(ssize_t *out, struct inode *inode, struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ size_t len, unsigned int flags);
int fuse_bpf_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync);
int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync);
int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to);
@@ -1500,6 +1503,13 @@ static inline int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *
return 0;
}

+static inline int fuse_bpf_copy_file_range(ssize_t *out, struct inode *inode, struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ size_t len, unsigned int flags)
+{
+ return 0;
+}
+
static inline int fuse_bpf_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:44:50

by Daniel Rosenberg

Subject: [RFC PATCH v3 23/37] fuse-bpf: Add xattr support

This adds backing support for FUSE_GETXATTR, FUSE_LISTXATTR, FUSE_SETXATTR,
and FUSE_REMOVEXATTR.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
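For reviewers less familiar with the xattr calling convention: the size
handling in fuse_getxattr_initialize_out() below (a zero size reports the
required length, a non-zero size fills the caller's buffer) mirrors the usual
two-call getxattr(2) pattern seen from userspace. The following is only a
minimal userspace sketch of that pattern, not part of the patch; read_xattr()
is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Hypothetical helper: read one extended attribute into a freshly
 * allocated buffer. Returns NULL on any error (errno is left as set by
 * getxattr(2) or malloc(3)); on success the caller frees the result.
 */
static void *read_xattr(const char *path, const char *name, size_t *len_out)
{
	ssize_t need, got;
	void *buf;

	assert(path && name && len_out);

	/* First call with size == 0 only asks how large the value is. */
	need = getxattr(path, name, NULL, 0);
	if (need < 0)
		return NULL;

	/* Allocate at least one byte so a zero-length value still works. */
	buf = malloc(need ? (size_t)need : 1);
	if (!buf)
		return NULL;

	/*
	 * Second call fetches the value. If the attribute grew between the
	 * two calls this fails with ERANGE and the caller may retry.
	 */
	got = getxattr(path, name, buf, (size_t)need);
	if (got < 0) {
		free(buf);
		return NULL;
	}

	*len_out = (size_t)got;
	return buf;
}
```

A caller would pass a path on the fuse-bpf mount and an attribute name such as
"user.demo" (both placeholders), then free() the returned buffer.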
fs/fuse/backing.c | 349 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/fuse_i.h | 30 ++++
fs/fuse/xattr.c | 18 +++
3 files changed, 397 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 928b24db2303..eb3eb184c867 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -982,6 +982,355 @@ int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file *file, loff_t
file, start, end, datasync);
}

+struct fuse_getxattr_args {
+ struct fuse_getxattr_in in;
+ struct fuse_buffer name;
+ struct fuse_buffer value;
+ struct fuse_getxattr_out out;
+};
+
+static int fuse_getxattr_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_getxattr_args *args,
+ struct dentry *dentry, const char *name, void *value,
+ size_t size)
+{
+ *args = (struct fuse_getxattr_args) {
+ .in.size = size,
+ .name = (struct fuse_buffer) {
+ .data = (void *) name,
+ .size = strlen(name) + 1,
+ .max_size = XATTR_NAME_MAX + 1,
+ .flags = BPF_FUSE_MUST_ALLOCATE | BPF_FUSE_VARIABLE_SIZE,
+ },
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(dentry->d_inode)->nodeid,
+ .opcode = FUSE_GETXATTR,
+ },
+ .in_numargs = 2,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_getxattr_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_getxattr_args *args,
+ struct dentry *dentry, const char *name, void *value,
+ size_t size)
+{
+ fa->flags = size ? FUSE_BPF_OUT_ARGVAR : 0;
+ fa->out_numargs = 1;
+ if (size) {
+ args->value = (struct fuse_buffer) {
+ .data = (void *) value,
+ .size = size,
+ .alloc_size = size,
+ .max_size = size,
+ .flags = BPF_FUSE_VARIABLE_SIZE,
+ };
+ fa->out_args[0].is_buffer = true;
+ fa->out_args[0].buffer = &args->value;
+ } else {
+ fa->out_args[0].size = sizeof(args->out);
+ fa->out_args[0].value = &args->out;
+ }
+ return 0;
+}
+
+static int fuse_getxattr_backing(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, const char *name, void *value,
+ size_t size)
+{
+ ssize_t ret;
+
+ if (fa->in_args[1].buffer->flags & BPF_FUSE_MODIFIED) {
+ // Ensure bpf provided string is null terminated
+ char *new_name = fa->in_args[1].buffer->data;
+ new_name[fa->in_args[1].buffer->size - 1] = 0;
+ }
+ ret = vfs_getxattr(&nop_mnt_idmap,
+ get_fuse_dentry(dentry)->backing_path.dentry,
+ fa->in_args[1].buffer->data, value, size);
+
+ if (fa->flags & FUSE_BPF_OUT_ARGVAR)
+ fa->out_args[0].buffer->size = ret;
+ else
+ ((struct fuse_getxattr_out *)fa->out_args[0].value)->size = ret;
+
+ return 0;
+}
+
+static int fuse_getxattr_finalize(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, const char *name, void *value,
+ size_t size)
+{
+ struct fuse_getxattr_out *fgo;
+
+ if (fa->flags & FUSE_BPF_OUT_ARGVAR) {
+ *out = fa->out_args[0].buffer->size;
+ return 0;
+ }
+
+ fgo = fa->out_args[0].value;
+
+ *out = fgo->size;
+ return 0;
+}
+
+int fuse_bpf_getxattr(int *out, struct inode *inode, struct dentry *dentry, const char *name,
+ void *value, size_t size)
+{
+ return bpf_fuse_backing(inode, struct fuse_getxattr_args, out,
+ fuse_getxattr_initialize_in, fuse_getxattr_initialize_out,
+ fuse_getxattr_backing, fuse_getxattr_finalize,
+ dentry, name, value, size);
+}
+
+static int fuse_listxattr_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_getxattr_args *args,
+ struct dentry *dentry, char *list, size_t size)
+{
+ *args = (struct fuse_getxattr_args) {
+ .in.size = size,
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(dentry->d_inode)->nodeid,
+ .opcode = FUSE_LISTXATTR,
+ },
+ .in_numargs = 1,
+ .in_args[0] =
+ (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_listxattr_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_getxattr_args *args,
+ struct dentry *dentry, char *list, size_t size)
+{
+ fa->out_numargs = 1;
+
+ if (size) {
+ args->value = (struct fuse_buffer) {
+ .data = list,
+ .size = size,
+ .alloc_size = size,
+ .max_size = size,
+ .flags = BPF_FUSE_VARIABLE_SIZE,
+ };
+ fa->flags = FUSE_BPF_OUT_ARGVAR;
+ fa->out_args[0].is_buffer = true;
+ fa->out_args[0].buffer = &args->value;
+ } else {
+ fa->out_args[0].size = sizeof(args->out);
+ fa->out_args[0].value = &args->out;
+ }
+ return 0;
+}
+
+static int fuse_listxattr_backing(struct bpf_fuse_args *fa, ssize_t *out, struct dentry *dentry,
+ char *list, size_t size)
+{
+ *out = vfs_listxattr(get_fuse_dentry(dentry)->backing_path.dentry, list, size);
+
+ if (*out < 0)
+ return *out;
+
+ if (fa->flags & FUSE_BPF_OUT_ARGVAR)
+ fa->out_args[0].buffer->size = *out;
+ else
+ ((struct fuse_getxattr_out *)fa->out_args[0].value)->size = *out;
+
+ return 0;
+}
+
+static int fuse_listxattr_finalize(struct bpf_fuse_args *fa, ssize_t *out, struct dentry *dentry,
+ char *list, size_t size)
+{
+ struct fuse_getxattr_out *fgo;
+
+ if (fa->info.error_in)
+ return 0;
+
+ if (fa->flags & FUSE_BPF_OUT_ARGVAR) {
+ *out = fa->out_args[0].buffer->size;
+ return 0;
+ }
+
+ fgo = fa->out_args[0].value;
+ *out = fgo->size;
+ return 0;
+}
+
+int fuse_bpf_listxattr(ssize_t *out, struct inode *inode, struct dentry *dentry,
+ char *list, size_t size)
+{
+ return bpf_fuse_backing(inode, struct fuse_getxattr_args, out,
+ fuse_listxattr_initialize_in, fuse_listxattr_initialize_out,
+ fuse_listxattr_backing, fuse_listxattr_finalize,
+ dentry, list, size);
+}
+
+struct fuse_setxattr_args {
+ struct fuse_setxattr_in in;
+ struct fuse_buffer name;
+ struct fuse_buffer value;
+};
+
+static int fuse_setxattr_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_setxattr_args *args,
+ struct dentry *dentry, const char *name,
+ const void *value, size_t size, int flags)
+{
+ *args = (struct fuse_setxattr_args) {
+ .in = (struct fuse_setxattr_in) {
+ .size = size,
+ .flags = flags,
+ },
+ .name = (struct fuse_buffer) {
+ .data = (void *) name,
+ .size = strlen(name) + 1,
+ .max_size = XATTR_NAME_MAX + 1,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ },
+ .value = (struct fuse_buffer) {
+ .data = (void *) value,
+ .size = size,
+ .max_size = XATTR_SIZE_MAX,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ },
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(dentry->d_inode)->nodeid,
+ .opcode = FUSE_SETXATTR,
+ },
+ .in_numargs = 3,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ .in_args[2] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->value,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_setxattr_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_setxattr_args *args,
+ struct dentry *dentry, const char *name,
+ const void *value, size_t size, int flags)
+{
+ return 0;
+}
+
+static int fuse_setxattr_backing(struct bpf_fuse_args *fa, int *out, struct dentry *dentry,
+ const char *name, const void *value, size_t size,
+ int flags)
+{
+ // TODO Ensure we actually use filter values
+ *out = vfs_setxattr(&nop_mnt_idmap,
+ get_fuse_dentry(dentry)->backing_path.dentry, name,
+ value, size, flags);
+ return 0;
+}
+
+static int fuse_setxattr_finalize(struct bpf_fuse_args *fa, int *out, struct dentry *dentry,
+ const char *name, const void *value, size_t size,
+ int flags)
+{
+ return 0;
+}
+
+int fuse_bpf_setxattr(int *out, struct inode *inode, struct dentry *dentry,
+ const char *name, const void *value, size_t size, int flags)
+{
+ return bpf_fuse_backing(inode, struct fuse_setxattr_args, out,
+ fuse_setxattr_initialize_in, fuse_setxattr_initialize_out,
+ fuse_setxattr_backing, fuse_setxattr_finalize,
+ dentry, name, value, size, flags);
+}
+
+static int fuse_removexattr_initialize_in(struct bpf_fuse_args *fa,
+ struct fuse_buffer *in,
+ struct dentry *dentry, const char *name)
+{
+ *in = (struct fuse_buffer) {
+ .data = (void *) name,
+ .size = strlen(name) + 1,
+ .max_size = XATTR_NAME_MAX + 1,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_fuse_inode(dentry->d_inode)->nodeid,
+ .opcode = FUSE_REMOVEXATTR,
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = in,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_removexattr_initialize_out(struct bpf_fuse_args *fa,
+ struct fuse_buffer *in,
+ struct dentry *dentry, const char *name)
+{
+ return 0;
+}
+
+static int fuse_removexattr_backing(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, const char *name)
+{
+ struct path *backing_path = &get_fuse_dentry(dentry)->backing_path;
+
+ /* TODO account for changes of the name by prefilter */
+ *out = vfs_removexattr(&nop_mnt_idmap, backing_path->dentry, name);
+ return 0;
+}
+
+static int fuse_removexattr_finalize(struct bpf_fuse_args *fa, int *out,
+ struct dentry *dentry, const char *name)
+{
+ return 0;
+}
+
+int fuse_bpf_removexattr(int *out, struct inode *inode, struct dentry *dentry, const char *name)
+{
+ return bpf_fuse_backing(inode, struct fuse_buffer, out,
+ fuse_removexattr_initialize_in, fuse_removexattr_initialize_out,
+ fuse_removexattr_backing, fuse_removexattr_finalize,
+ dentry, name);
+}
+
static inline void fuse_bpf_aio_put(struct fuse_bpf_aio_req *aio_req)
{
if (refcount_dec_and_test(&aio_req->ref))
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 74540f308636..243a8fe0c343 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1426,6 +1426,13 @@ int fuse_bpf_copy_file_range(ssize_t *out, struct inode *inode, struct file *fil
size_t len, unsigned int flags);
int fuse_bpf_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync);
int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync);
+int fuse_bpf_getxattr(int *out, struct inode *inode, struct dentry *dentry,
+ const char *name, void *value, size_t size);
+int fuse_bpf_listxattr(ssize_t *out, struct inode *inode, struct dentry *dentry, char *list, size_t size);
+int fuse_bpf_setxattr(int *out, struct inode *inode, struct dentry *dentry,
+ const char *name, const void *value, size_t size,
+ int flags);
+int fuse_bpf_removexattr(int *out, struct inode *inode, struct dentry *dentry, const char *name);
int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to);
int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *from);
int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, int mode, loff_t offset, loff_t length);
@@ -1520,6 +1527,29 @@ static inline int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file
return 0;
}

+static inline int fuse_bpf_getxattr(int *out, struct inode *inode, struct dentry *dentry,
+ const char *name, void *value, size_t size)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_listxattr(ssize_t *out, struct inode *inode, struct dentry *dentry, char *list, size_t size)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_setxattr(int *out, struct inode *inode, struct dentry *dentry,
+ const char *name, const void *value, size_t size,
+ int flags)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_removexattr(int *out, struct inode *inode, struct dentry *dentry, const char *name)
+{
+ return 0;
+}
+
static inline int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to)
{
return 0;
diff --git a/fs/fuse/xattr.c b/fs/fuse/xattr.c
index 49c01559580f..d00f7dc50038 100644
--- a/fs/fuse/xattr.c
+++ b/fs/fuse/xattr.c
@@ -118,6 +118,9 @@ ssize_t fuse_listxattr(struct dentry *entry, char *list, size_t size)
if (fuse_is_bad(inode))
return -EIO;

+ if (fuse_bpf_listxattr(&ret, inode, entry, list, size))
+ return ret;
+
if (!fuse_allow_current_process(fm->fc))
return -EACCES;

@@ -182,9 +185,14 @@ static int fuse_xattr_get(const struct xattr_handler *handler,
struct dentry *dentry, struct inode *inode,
const char *name, void *value, size_t size)
{
+ int err;
+
if (fuse_is_bad(inode))
return -EIO;

+ if (fuse_bpf_getxattr(&err, inode, dentry, name, value, size))
+ return err;
+
return fuse_getxattr(inode, name, value, size);
}

@@ -194,9 +202,19 @@ static int fuse_xattr_set(const struct xattr_handler *handler,
const char *name, const void *value, size_t size,
int flags)
{
+ int err;
+ bool handled;
+
if (fuse_is_bad(inode))
return -EIO;

+ if (value)
+ handled = fuse_bpf_setxattr(&err, inode, dentry, name, value, size, flags);
+ else
+ handled = fuse_bpf_removexattr(&err, inode, dentry, name);
+ if (handled)
+ return err;
+
if (!value)
return fuse_removexattr(inode, name);

--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:45:15

by Daniel Rosenberg

Subject: [RFC PATCH v3 24/37] fuse-bpf: Add symlink/link support

This adds backing support for FUSE_LINK, FUSE_READLINK, and FUSE_SYMLINK.

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
---
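As a quick way to exercise the symlink and readlink paths added here from
userspace, something like the following could be pointed at a fuse-bpf mount.
This is only an illustrative sketch and not part of the patch;
symlink_roundtrip() is a hypothetical helper, and the 4096-byte buffer is an
assumed upper bound on the target length.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a symlink at linkpath pointing to target,
 * read it back, and remove it again. Returns 0 if the target read back
 * matches what was written, -1 otherwise. linkpath must be a scratch
 * path: anything already there is unlinked first.
 */
static int symlink_roundtrip(const char *linkpath, const char *target)
{
	char buf[4096];
	ssize_t n;
	int ret = -1;

	assert(linkpath && target);

	unlink(linkpath);	/* ignore failure, the path may not exist */
	if (symlink(target, linkpath) != 0)
		return -1;

	/* readlink(2) does not NUL-terminate, so leave room and do it here. */
	n = readlink(linkpath, buf, sizeof(buf) - 1);
	if (n >= 0) {
		buf[n] = '\0';
		if (strcmp(buf, target) == 0)
			ret = 0;
	}

	unlink(linkpath);
	return ret;
}
```

The target does not need to exist, since symlink(2) stores the string as-is.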
fs/fuse/backing.c | 327 ++++++++++++++++++++++++++++++++++++++++++++++
fs/fuse/dir.c | 11 ++
fs/fuse/fuse_i.h | 20 +++
3 files changed, 358 insertions(+)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index eb3eb184c867..e807ae4f6f53 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -2502,6 +2502,125 @@ int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry)
dir, entry);
}

+struct fuse_link_args {
+ struct fuse_link_in in;
+ struct fuse_buffer name;
+};
+
+static int fuse_link_initialize_in(struct bpf_fuse_args *fa, struct fuse_link_args *args,
+ struct dentry *entry, struct inode *dir,
+ struct dentry *newent)
+{
+ struct inode *src_inode = entry->d_inode;
+
+ *args = (struct fuse_link_args) {
+ .in = (struct fuse_link_in) {
+ .oldnodeid = get_node_id(src_inode),
+ },
+ .name = (struct fuse_buffer) {
+ .data = (void *) newent->d_name.name,
+ .size = newent->d_name.len + 1,
+ .max_size = NAME_MAX + 1,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ },
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .opcode = FUSE_LINK,
+ },
+ .in_numargs = 2,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .size = sizeof(args->in),
+ .value = &args->in,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_link_initialize_out(struct bpf_fuse_args *fa, struct fuse_link_args *args,
+ struct dentry *entry, struct inode *dir,
+ struct dentry *newent)
+{
+ return 0;
+}
+
+static int fuse_link_backing(struct bpf_fuse_args *fa, int *out, struct dentry *entry,
+ struct inode *dir, struct dentry *newent)
+{
+ struct path backing_old_path;
+ struct path backing_new_path;
+ struct dentry *backing_dir_dentry;
+ struct inode *fuse_new_inode = NULL;
+ struct fuse_inode *fuse_dir_inode = get_fuse_inode(dir);
+ struct inode *backing_dir_inode = fuse_dir_inode->backing_inode;
+
+ *out = 0;
+ get_fuse_backing_path(entry, &backing_old_path);
+ if (!backing_old_path.dentry)
+ return -EBADF;
+
+ get_fuse_backing_path(newent, &backing_new_path);
+ if (!backing_new_path.dentry) {
+ *out = -EBADF;
+ goto err_dst_path;
+ }
+
+ backing_dir_dentry = dget_parent(backing_new_path.dentry);
+ backing_dir_inode = d_inode(backing_dir_dentry);
+
+ inode_lock_nested(backing_dir_inode, I_MUTEX_PARENT);
+ *out = vfs_link(backing_old_path.dentry, &nop_mnt_idmap,
+ backing_dir_inode, backing_new_path.dentry, NULL);
+ inode_unlock(backing_dir_inode);
+ if (*out)
+ goto out;
+
+ if (d_really_is_negative(backing_new_path.dentry) ||
+ unlikely(d_unhashed(backing_new_path.dentry))) {
+ *out = -EINVAL;
+ /*
+ * TODO: overlayfs responds to this situation with a
+ * lookup_one_len(). Should we do that too?
+ */
+ goto out;
+ }
+
+ fuse_new_inode = fuse_iget_backing(dir->i_sb, fuse_dir_inode->nodeid, backing_dir_inode);
+ if (IS_ERR(fuse_new_inode)) {
+ *out = PTR_ERR(fuse_new_inode);
+ goto out;
+ }
+ d_instantiate(newent, fuse_new_inode);
+
+out:
+ dput(backing_dir_dentry);
+ path_put(&backing_new_path);
+err_dst_path:
+ path_put(&backing_old_path);
+ return *out;
+}
+
+static int fuse_link_finalize(struct bpf_fuse_args *fa, int *out, struct dentry *entry,
+ struct inode *dir, struct dentry *newent)
+{
+ return 0;
+}
+
+int fuse_bpf_link(int *out, struct inode *inode, struct dentry *entry,
+ struct inode *newdir, struct dentry *newent)
+{
+ return bpf_fuse_backing(inode, struct fuse_link_args, out,
+ fuse_link_initialize_in, fuse_link_initialize_out,
+ fuse_link_backing, fuse_link_finalize,
+ entry, newdir, newent);
+}
+
struct fuse_getattr_args {
struct fuse_getattr_in in;
struct fuse_attr_out out;
@@ -2790,6 +2909,214 @@ int fuse_bpf_statfs(int *out, struct inode *inode, struct dentry *dentry, struct
dentry, buf);
}

+struct fuse_get_link_args {
+ struct fuse_buffer name;
+ struct fuse_buffer path;
+};
+
+static int fuse_get_link_initialize_in(struct bpf_fuse_args *fa, struct fuse_get_link_args *args,
+ struct inode *inode, struct dentry *dentry,
+ struct delayed_call *callback)
+{
+ /*
+ * TODO
+ * If we want to handle changing these things, we'll need to copy
+ * the lower fs's data into our own buffer, and provide our own callback
+ * to free that buffer.
+ *
+ * Pre could change the name we're looking at
+ * postfilter can change the name we return
+ *
+ * We ought to only make that buffer if it's been requested, so leaving
+ * this unimplemented for the moment
+ */
+ *args = (struct fuse_get_link_args) {
+ .name = (struct fuse_buffer) {
+ .data = (void *) dentry->d_name.name,
+ .size = dentry->d_name.len + 1,
+ .max_size = NAME_MAX + 1,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ },
+ };
+
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .opcode = FUSE_READLINK,
+ .nodeid = get_node_id(inode),
+ },
+ .in_numargs = 1,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_get_link_initialize_out(struct bpf_fuse_args *fa, struct fuse_get_link_args *args,
+ struct inode *inode, struct dentry *dentry,
+ struct delayed_call *callback)
+{
+ // TODO
+#if 0
+ args->path = (struct fuse_buffer) {
+ .data = NULL,
+ .size = 0,
+ .max_size = PATH_MAX,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ };
+ fa->out_numargs = 1;
+ fa->out_args[0].is_buffer = true;
+ fa->out_args[0].buffer = &args->path;
+#endif
+
+ return 0;
+}
+
+static int fuse_get_link_backing(struct bpf_fuse_args *fa, const char **out,
+ struct inode *inode, struct dentry *dentry,
+ struct delayed_call *callback)
+{
+ struct path backing_path;
+
+ if (!dentry) {
+ *out = ERR_PTR(-ECHILD);
+ return PTR_ERR(*out);
+ }
+
+ get_fuse_backing_path(dentry, &backing_path);
+ if (!backing_path.dentry) {
+ *out = ERR_PTR(-ECHILD);
+ return PTR_ERR(*out);
+ }
+
+ /*
+ * TODO: If we want to do our own thing, copy the data and then call the
+ * callback
+ */
+ *out = vfs_get_link(backing_path.dentry, callback);
+
+ path_put(&backing_path);
+ return 0;
+}
+
+static int fuse_get_link_finalize(struct bpf_fuse_args *fa, const char **out,
+ struct inode *inode, struct dentry *dentry,
+ struct delayed_call *callback)
+{
+ return 0;
+}
+
+int fuse_bpf_get_link(const char **out, struct inode *inode, struct dentry *dentry,
+ struct delayed_call *callback)
+{
+ return bpf_fuse_backing(inode, struct fuse_get_link_args, out,
+ fuse_get_link_initialize_in, fuse_get_link_initialize_out,
+ fuse_get_link_backing, fuse_get_link_finalize,
+ inode, dentry, callback);
+}
+
+struct fuse_symlink_args {
+ struct fuse_buffer name;
+ struct fuse_buffer path;
+};
+
+static int fuse_symlink_initialize_in(struct bpf_fuse_args *fa, struct fuse_symlink_args *args,
+ struct inode *dir, struct dentry *entry, const char *link, int len)
+{
+ *args = (struct fuse_symlink_args) {
+ .name = (struct fuse_buffer) {
+ .data = (void *) entry->d_name.name,
+ .size = entry->d_name.len + 1,
+ .flags = BPF_FUSE_IMMUTABLE,
+ },
+ .path = (struct fuse_buffer) {
+ .data = (void *) link,
+ .size = len,
+ .max_size = PATH_MAX,
+ .flags = BPF_FUSE_VARIABLE_SIZE | BPF_FUSE_MUST_ALLOCATE,
+ },
+ };
+ *fa = (struct bpf_fuse_args) {
+ .info = (struct bpf_fuse_meta_info) {
+ .nodeid = get_node_id(dir),
+ .opcode = FUSE_SYMLINK,
+ },
+ .in_numargs = 2,
+ .in_args[0] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->name,
+ },
+ .in_args[1] = (struct bpf_fuse_arg) {
+ .is_buffer = true,
+ .buffer = &args->path,
+ },
+ };
+
+ return 0;
+}
+
+static int fuse_symlink_initialize_out(struct bpf_fuse_args *fa, struct fuse_symlink_args *args,
+ struct inode *dir, struct dentry *entry, const char *link, int len)
+{
+ return 0;
+}
+
+static int fuse_symlink_backing(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry, const char *link, int len)
+{
+ struct fuse_inode *fuse_inode = get_fuse_inode(dir);
+ struct inode *backing_inode = fuse_inode->backing_inode;
+ struct path backing_path;
+ struct inode *inode = NULL;
+
+ *out = 0;
+ // TODO: Actually deal with changing the backing entry in symlink
+ get_fuse_backing_path(entry, &backing_path);
+ if (!backing_path.dentry)
+ return -EBADF;
+
+ inode_lock_nested(backing_inode, I_MUTEX_PARENT);
+ *out = vfs_symlink(&nop_mnt_idmap, backing_inode, backing_path.dentry,
+ link);
+ inode_unlock(backing_inode);
+ if (*out)
+ goto out;
+ if (d_really_is_negative(backing_path.dentry) ||
+ unlikely(d_unhashed(backing_path.dentry))) {
+ *out = -EINVAL;
+ /*
+ * TODO: overlayfs responds to this situation with a
+ * lookup_one_len(). Should we do that too?
+ */
+ goto out;
+ }
+ inode = fuse_iget_backing(dir->i_sb, fuse_inode->nodeid, backing_inode);
+ if (IS_ERR(inode)) {
+ *out = PTR_ERR(inode);
+ goto out;
+ }
+ d_instantiate(entry, inode);
+out:
+ path_put(&backing_path);
+ return *out;
+}
+
+static int fuse_symlink_finalize(struct bpf_fuse_args *fa, int *out,
+ struct inode *dir, struct dentry *entry, const char *link, int len)
+{
+ return 0;
+}
+
+int fuse_bpf_symlink(int *out, struct inode *dir, struct dentry *entry, const char *link, int len)
+{
+ return bpf_fuse_backing(dir, struct fuse_symlink_args, out,
+ fuse_symlink_initialize_in, fuse_symlink_initialize_out,
+ fuse_symlink_backing, fuse_symlink_finalize,
+ dir, entry, link, len);
+}
+
struct fuse_read_args {
struct fuse_read_in in;
struct fuse_read_out out;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 7d589241c9b0..d1c3b2bfb0b1 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1013,6 +1013,10 @@ static int fuse_symlink(struct mnt_idmap *idmap, struct inode *dir,
struct fuse_mount *fm = get_fuse_mount(dir);
unsigned len = strlen(link) + 1;
FUSE_ARGS(args);
+ int err;
+
+ if (fuse_bpf_symlink(&err, dir, entry, link, len))
+ return err;

args.opcode = FUSE_SYMLINK;
args.in_numargs = 2;
@@ -1219,6 +1223,9 @@ static int fuse_link(struct dentry *entry, struct inode *newdir,
struct fuse_mount *fm = get_fuse_mount(inode);
FUSE_ARGS(args);

+ if (fuse_bpf_link(&err, inode, entry, newdir, newent))
+ return err;
+
memset(&inarg, 0, sizeof(inarg));
inarg.oldnodeid = get_node_id(inode);
args.opcode = FUSE_LINK;
@@ -1618,12 +1625,16 @@ static const char *fuse_get_link(struct dentry *dentry, struct inode *inode,
{
struct fuse_conn *fc = get_fuse_conn(inode);
struct page *page;
+ const char *out = NULL;
int err;

err = -EIO;
if (fuse_is_bad(inode))
goto out_err;

+ if (fuse_bpf_get_link(&out, inode, dentry, callback))
+ return out;
+
if (fc->cache_symlinks)
return page_get_link(dentry, inode, callback);

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 243a8fe0c343..121d31a04e79 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1417,6 +1417,7 @@ int fuse_bpf_rename2(int *out, struct inode *olddir, struct dentry *oldent,
int fuse_bpf_rename(int *out, struct inode *olddir, struct dentry *oldent,
struct inode *newdir, struct dentry *newent);
int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry);
+int fuse_bpf_link(int *out, struct inode *inode, struct dentry *entry, struct inode *dir, struct dentry *newent);
int fuse_bpf_release(int *out, struct inode *inode, struct file *file);
int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file);
int fuse_bpf_flush(int *out, struct inode *inode, struct file *file, fl_owner_t id);
@@ -1441,6 +1442,9 @@ int fuse_bpf_getattr(int *out, struct inode *inode, const struct dentry *entry,
u32 request_mask, unsigned int flags);
int fuse_bpf_setattr(int *out, struct inode *inode, struct dentry *dentry, struct iattr *attr, struct file *file);
int fuse_bpf_statfs(int *out, struct inode *inode, struct dentry *dentry, struct kstatfs *buf);
+int fuse_bpf_get_link(const char **out, struct inode *inode, struct dentry *dentry,
+ struct delayed_call *callback);
+int fuse_bpf_symlink(int *out, struct inode *dir, struct dentry *entry, const char *link, int len);
int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct dir_context *ctx);
int fuse_bpf_access(int *out, struct inode *inode, int mask);

@@ -1490,6 +1494,11 @@ static inline int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *en
return 0;
}

+static inline int fuse_bpf_link(int *out, struct inode *inode, struct dentry *entry, struct inode *dir, struct dentry *newent)
+{
+ return 0;
+}
+
static inline int fuse_bpf_release(int *out, struct inode *inode, struct file *file)
{
return 0;
@@ -1586,6 +1595,17 @@ static inline int fuse_bpf_statfs(int *out, struct inode *inode, struct dentry *
return 0;
}

+static inline int fuse_bpf_get_link(const char **out, struct inode *inode, struct dentry *dentry,
+ struct delayed_call *callback)
+{
+ return 0;
+}
+
+static inline int fuse_bpf_symlink(int *out, struct inode *dir, struct dentry *entry, const char *link, int len)
+{
+ return 0;
+}
+
static inline int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct dir_context *ctx)
{
return 0;
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:45:25

by Daniel Rosenberg

Subject: [RFC PATCH v3 34/37] WIP: fuse-bpf: add error_out

The error_out field will allow differentiating between a bpf program
altering the reported error code and the bpf program itself returning an
error. TODO

Signed-off-by: Daniel Rosenberg <[email protected]>
---
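To make the intended distinction concrete, here is one way the two cases
could be told apart once the field is wired up. This is purely an
illustrative sketch and not the design of this series: it assumes the caller
seeds error_out with error_in before running the program, and that a negative
program return value means the program itself failed. The struct and helper
names are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct bpf_fuse_meta_info with the new field added. */
struct meta_info_sketch {
	uint64_t nodeid;
	uint32_t opcode;
	uint32_t error_in;
	uint32_t error_out;
};

enum filter_outcome {
	FILTER_PROG_FAILED,	/* the bpf program itself returned an error */
	FILTER_ERROR_ALTERED,	/* the program replaced the reported error */
	FILTER_ERROR_UNCHANGED,	/* the program left the error alone */
};

/*
 * Hypothetical classification helper. Assumes error_out was initialised
 * to error_in before the program ran, so any difference afterwards is a
 * deliberate change made by the program.
 */
static enum filter_outcome
classify_filter_result(int prog_ret, const struct meta_info_sketch *info)
{
	assert(info);

	if (prog_ret < 0)
		return FILTER_PROG_FAILED;
	if (info->error_out != info->error_in)
		return FILTER_ERROR_ALTERED;
	return FILTER_ERROR_UNCHANGED;
}
```

Without a separate error_out, the first two cases would both show up as "the
filter produced an error", which is the ambiguity the commit message
describes.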
include/linux/bpf_fuse.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/linux/bpf_fuse.h b/include/linux/bpf_fuse.h
index 159b850e1b46..15646ba59c41 100644
--- a/include/linux/bpf_fuse.h
+++ b/include/linux/bpf_fuse.h
@@ -57,6 +57,7 @@ struct bpf_fuse_meta_info {
uint64_t nodeid;
uint32_t opcode;
uint32_t error_in;
+ uint32_t error_out; // TODO: struct_op programs may set this to alter reported error code
};

struct bpf_fuse_args {
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:45:27

by Daniel Rosenberg

Subject: [RFC PATCH v3 27/37] fuse-bpf: Add fuse-bpf constants

This adds constants that fuse_op programs will rely on for communicating
what action fuse should take next.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
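For illustration, the op code filter values in the hunk below can be read as a
16-bit opcode mask plus two phase flags. The sketch here shows how a combined
value might be split apart; it assumes the opcode sits in the low 16 bits and
the pre/post flags are OR'd in above it, which the constants suggest but this
patch does not spell out. struct fuse_filter_key and fuse_filter_decode() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the hunk in this patch. */
#define FUSE_OPCODE_FILTER	0x0ffff
#define FUSE_PREFILTER		0x10000
#define FUSE_POSTFILTER		0x20000

/* The phase flags must not overlap the opcode mask. */
static_assert((FUSE_OPCODE_FILTER & (FUSE_PREFILTER | FUSE_POSTFILTER)) == 0,
	      "phase flags overlap the opcode mask");

/* Hypothetical decoded form of a combined filter value. */
struct fuse_filter_key {
	uint32_t opcode;	/* FUSE opcode, e.g. 22 for FUSE_GETXATTR */
	bool pre;		/* FUSE_PREFILTER bit was set */
	bool post;		/* FUSE_POSTFILTER bit was set */
};

/* Split a combined value into its opcode and phase flags. */
static struct fuse_filter_key fuse_filter_decode(uint32_t filter)
{
	struct fuse_filter_key key = {
		.opcode = filter & FUSE_OPCODE_FILTER,
		.pre = (filter & FUSE_PREFILTER) != 0,
		.post = (filter & FUSE_POSTFILTER) != 0,
	};

	return key;
}
```

For example, decoding (22 | FUSE_PREFILTER) yields opcode 22 with only the
pre flag set.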
include/uapi/linux/bpf.h | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4b20a7269bee..6521c40875c7 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -7155,4 +7155,16 @@ struct bpf_iter_num {
__u64 __opaque[1];
} __attribute__((aligned(8)));

+/* Return Codes for Fuse BPF struct_op programs */
+#define BPF_FUSE_CONTINUE 0
+#define BPF_FUSE_USER 1
+#define BPF_FUSE_USER_PREFILTER 2
+#define BPF_FUSE_POSTFILTER 3
+#define BPF_FUSE_USER_POSTFILTER 4
+
+/* Op Code Filter values for BPF Programs */
+#define FUSE_OPCODE_FILTER 0x0ffff
+#define FUSE_PREFILTER 0x10000
+#define FUSE_POSTFILTER 0x20000
+
#endif /* _UAPI__LINUX_BPF_H__ */
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:45:30

by Daniel Rosenberg

Subject: [RFC PATCH v3 35/37] tools: Add FUSE, update bpf includes

Update the bpf includes under tools/ and add the FUSE uapi header.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
tools/include/uapi/linux/bpf.h | 12 +
tools/include/uapi/linux/fuse.h | 1135 +++++++++++++++++++++++++++++++
2 files changed, 1147 insertions(+)
create mode 100644 tools/include/uapi/linux/fuse.h

diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4b20a7269bee..6521c40875c7 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -7155,4 +7155,16 @@ struct bpf_iter_num {
__u64 __opaque[1];
} __attribute__((aligned(8)));

+/* Return Codes for Fuse BPF struct_op programs */
+#define BPF_FUSE_CONTINUE 0
+#define BPF_FUSE_USER 1
+#define BPF_FUSE_USER_PREFILTER 2
+#define BPF_FUSE_POSTFILTER 3
+#define BPF_FUSE_USER_POSTFILTER 4
+
+/* Op Code Filter values for BPF Programs */
+#define FUSE_OPCODE_FILTER 0x0ffff
+#define FUSE_PREFILTER 0x10000
+#define FUSE_POSTFILTER 0x20000
+
#endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/include/uapi/linux/fuse.h b/tools/include/uapi/linux/fuse.h
new file mode 100644
index 000000000000..72c2190a1b0a
--- /dev/null
+++ b/tools/include/uapi/linux/fuse.h
@@ -0,0 +1,1135 @@
+/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) */
+/*
+ This file defines the kernel interface of FUSE
+ Copyright (C) 2001-2008 Miklos Szeredi <[email protected]>
+
+ This program can be distributed under the terms of the GNU GPL.
+ See the file COPYING.
+
+ This -- and only this -- header file may also be distributed under
+ the terms of the BSD Licence as follows:
+
+ Copyright (C) 2001-2007 Miklos Szeredi. All rights reserved.
+
+ Redistribution and use in source and binary forms, with or without
+ modification, are permitted provided that the following conditions
+ are met:
+ 1. Redistributions of source code must retain the above copyright
+ notice, this list of conditions and the following disclaimer.
+ 2. Redistributions in binary form must reproduce the above copyright
+ notice, this list of conditions and the following disclaimer in the
+ documentation and/or other materials provided with the distribution.
+
+ THIS SOFTWARE IS PROVIDED BY AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
+ FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ SUCH DAMAGE.
+*/
+
+/*
+ * This file defines the kernel interface of FUSE
+ *
+ * Protocol changelog:
+ *
+ * 7.1:
+ * - add the following messages:
+ * FUSE_SETATTR, FUSE_SYMLINK, FUSE_MKNOD, FUSE_MKDIR, FUSE_UNLINK,
+ * FUSE_RMDIR, FUSE_RENAME, FUSE_LINK, FUSE_OPEN, FUSE_READ, FUSE_WRITE,
+ * FUSE_RELEASE, FUSE_FSYNC, FUSE_FLUSH, FUSE_SETXATTR, FUSE_GETXATTR,
+ * FUSE_LISTXATTR, FUSE_REMOVEXATTR, FUSE_OPENDIR, FUSE_READDIR,
+ * FUSE_RELEASEDIR
+ * - add padding to messages to accommodate 32-bit servers on 64-bit kernels
+ *
+ * 7.2:
+ * - add FOPEN_DIRECT_IO and FOPEN_KEEP_CACHE flags
+ * - add FUSE_FSYNCDIR message
+ *
+ * 7.3:
+ * - add FUSE_ACCESS message
+ * - add FUSE_CREATE message
+ * - add filehandle to fuse_setattr_in
+ *
+ * 7.4:
+ * - add frsize to fuse_kstatfs
+ * - clean up request size limit checking
+ *
+ * 7.5:
+ * - add flags and max_write to fuse_init_out
+ *
+ * 7.6:
+ * - add max_readahead to fuse_init_in and fuse_init_out
+ *
+ * 7.7:
+ * - add FUSE_INTERRUPT message
+ * - add POSIX file lock support
+ *
+ * 7.8:
+ * - add lock_owner and flags fields to fuse_release_in
+ * - add FUSE_BMAP message
+ * - add FUSE_DESTROY message
+ *
+ * 7.9:
+ * - new fuse_getattr_in input argument of GETATTR
+ * - add lk_flags in fuse_lk_in
+ * - add lock_owner field to fuse_setattr_in, fuse_read_in and fuse_write_in
+ * - add blksize field to fuse_attr
+ * - add file flags field to fuse_read_in and fuse_write_in
+ * - Add ATIME_NOW and MTIME_NOW flags to fuse_setattr_in
+ *
+ * 7.10
+ * - add nonseekable open flag
+ *
+ * 7.11
+ * - add IOCTL message
+ * - add unsolicited notification support
+ * - add POLL message and NOTIFY_POLL notification
+ *
+ * 7.12
+ * - add umask flag to input argument of create, mknod and mkdir
+ * - add notification messages for invalidation of inodes and
+ * directory entries
+ *
+ * 7.13
+ * - make max number of background requests and congestion threshold
+ * tunables
+ *
+ * 7.14
+ * - add splice support to fuse device
+ *
+ * 7.15
+ * - add store notify
+ * - add retrieve notify
+ *
+ * 7.16
+ * - add BATCH_FORGET request
+ * - FUSE_IOCTL_UNRESTRICTED shall now return with array of 'struct
+ * fuse_ioctl_iovec' instead of ambiguous 'struct iovec'
+ * - add FUSE_IOCTL_32BIT flag
+ *
+ * 7.17
+ * - add FUSE_FLOCK_LOCKS and FUSE_RELEASE_FLOCK_UNLOCK
+ *
+ * 7.18
+ * - add FUSE_IOCTL_DIR flag
+ * - add FUSE_NOTIFY_DELETE
+ *
+ * 7.19
+ * - add FUSE_FALLOCATE
+ *
+ * 7.20
+ * - add FUSE_AUTO_INVAL_DATA
+ *
+ * 7.21
+ * - add FUSE_READDIRPLUS
+ * - send the requested events in POLL request
+ *
+ * 7.22
+ * - add FUSE_ASYNC_DIO
+ *
+ * 7.23
+ * - add FUSE_WRITEBACK_CACHE
+ * - add time_gran to fuse_init_out
+ * - add reserved space to fuse_init_out
+ * - add FATTR_CTIME
+ * - add ctime and ctimensec to fuse_setattr_in
+ * - add FUSE_RENAME2 request
+ * - add FUSE_NO_OPEN_SUPPORT flag
+ *
+ * 7.24
+ * - add FUSE_LSEEK for SEEK_HOLE and SEEK_DATA support
+ *
+ * 7.25
+ * - add FUSE_PARALLEL_DIROPS
+ *
+ * 7.26
+ * - add FUSE_HANDLE_KILLPRIV
+ * - add FUSE_POSIX_ACL
+ *
+ * 7.27
+ * - add FUSE_ABORT_ERROR
+ *
+ * 7.28
+ * - add FUSE_COPY_FILE_RANGE
+ * - add FOPEN_CACHE_DIR
+ * - add FUSE_MAX_PAGES, add max_pages to init_out
+ * - add FUSE_CACHE_SYMLINKS
+ *
+ * 7.29
+ * - add FUSE_NO_OPENDIR_SUPPORT flag
+ *
+ * 7.30
+ * - add FUSE_EXPLICIT_INVAL_DATA
+ * - add FUSE_IOCTL_COMPAT_X32
+ *
+ * 7.31
+ * - add FUSE_WRITE_KILL_PRIV flag
+ * - add FUSE_SETUPMAPPING and FUSE_REMOVEMAPPING
+ * - add map_alignment to fuse_init_out, add FUSE_MAP_ALIGNMENT flag
+ *
+ * 7.32
+ * - add flags to fuse_attr, add FUSE_ATTR_SUBMOUNT, add FUSE_SUBMOUNTS
+ *
+ * 7.33
+ * - add FUSE_HANDLE_KILLPRIV_V2, FUSE_WRITE_KILL_SUIDGID, FATTR_KILL_SUIDGID
+ * - add FUSE_OPEN_KILL_SUIDGID
+ * - extend fuse_setxattr_in, add FUSE_SETXATTR_EXT
+ * - add FUSE_SETXATTR_ACL_KILL_SGID
+ *
+ * 7.34
+ * - add FUSE_SYNCFS
+ *
+ * 7.35
+ * - add FOPEN_NOFLUSH
+ *
+ * 7.36
+ * - extend fuse_init_in with reserved fields, add FUSE_INIT_EXT init flag
+ * - add flags2 to fuse_init_in and fuse_init_out
+ * - add FUSE_SECURITY_CTX init flag
+ * - add security context to create, mkdir, symlink, and mknod requests
+ * - add FUSE_HAS_INODE_DAX, FUSE_ATTR_DAX
+ *
+ * 7.37
+ * - add FUSE_TMPFILE
+ *
+ * 7.38
+ * - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry
+ * - add FOPEN_PARALLEL_DIRECT_WRITES
+ * - add total_extlen to fuse_in_header
+ * - add FUSE_MAX_NR_SECCTX
+ * - add extension header
+ * - add FUSE_EXT_GROUPS
+ * - add FUSE_CREATE_SUPP_GROUP
+ */
+
+#ifndef _LINUX_FUSE_H
+#define _LINUX_FUSE_H
+
+#ifdef __KERNEL__
+#include <linux/types.h>
+#else
+#include <stdint.h>
+#endif
+
+/*
+ * Version negotiation:
+ *
+ * Both the kernel and userspace send the version they support in the
+ * INIT request and reply respectively.
+ *
+ * If the major versions match then both shall use the smallest
+ * of the two minor versions for communication.
+ *
+ * If the kernel supports a larger major version, then userspace shall
+ * reply with the major version it supports, ignore the rest of the
+ * INIT message and expect a new INIT message from the kernel with a
+ * matching major version.
+ *
+ * If the library supports a larger major version, then it shall fall
+ * back to the major protocol version sent by the kernel for
+ * communication and reply with that major version (and an arbitrary
+ * supported minor version).
+ */
+
+/** Version number of this interface */
+#define FUSE_KERNEL_VERSION 7
+
+/** Minor version number of this interface */
+#define FUSE_KERNEL_MINOR_VERSION 38
+
+/** The node ID of the root inode */
+#define FUSE_ROOT_ID 1
+
+/* Make sure all structures are padded to 64bit boundary, so 32bit
+ userspace works under 64bit kernels */
+
+struct fuse_attr {
+ uint64_t ino;
+ uint64_t size;
+ uint64_t blocks;
+ uint64_t atime;
+ uint64_t mtime;
+ uint64_t ctime;
+ uint32_t atimensec;
+ uint32_t mtimensec;
+ uint32_t ctimensec;
+ uint32_t mode;
+ uint32_t nlink;
+ uint32_t uid;
+ uint32_t gid;
+ uint32_t rdev;
+ uint32_t blksize;
+ uint32_t flags;
+};
+
+struct fuse_kstatfs {
+ uint64_t blocks;
+ uint64_t bfree;
+ uint64_t bavail;
+ uint64_t files;
+ uint64_t ffree;
+ uint32_t bsize;
+ uint32_t namelen;
+ uint32_t frsize;
+ uint32_t padding;
+ uint32_t spare[6];
+};
+
+struct fuse_file_lock {
+ uint64_t start;
+ uint64_t end;
+ uint32_t type;
+ uint32_t pid; /* tgid */
+};
+
+/**
+ * Bitmasks for fuse_setattr_in.valid
+ */
+#define FATTR_MODE (1 << 0)
+#define FATTR_UID (1 << 1)
+#define FATTR_GID (1 << 2)
+#define FATTR_SIZE (1 << 3)
+#define FATTR_ATIME (1 << 4)
+#define FATTR_MTIME (1 << 5)
+#define FATTR_FH (1 << 6)
+#define FATTR_ATIME_NOW (1 << 7)
+#define FATTR_MTIME_NOW (1 << 8)
+#define FATTR_LOCKOWNER (1 << 9)
+#define FATTR_CTIME (1 << 10)
+#define FATTR_KILL_SUIDGID (1 << 11)
+
+/**
+ * Flags returned by the OPEN request
+ *
+ * FOPEN_DIRECT_IO: bypass page cache for this open file
+ * FOPEN_KEEP_CACHE: don't invalidate the data cache on open
+ * FOPEN_NONSEEKABLE: the file is not seekable
+ * FOPEN_CACHE_DIR: allow caching this directory
+ * FOPEN_STREAM: the file is stream-like (no file position at all)
+ * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
+ * FOPEN_PARALLEL_DIRECT_WRITES: Allow concurrent direct writes on the same inode
+ */
+#define FOPEN_DIRECT_IO (1 << 0)
+#define FOPEN_KEEP_CACHE (1 << 1)
+#define FOPEN_NONSEEKABLE (1 << 2)
+#define FOPEN_CACHE_DIR (1 << 3)
+#define FOPEN_STREAM (1 << 4)
+#define FOPEN_NOFLUSH (1 << 5)
+#define FOPEN_PARALLEL_DIRECT_WRITES (1 << 6)
+
+/**
+ * INIT request/reply flags
+ *
+ * FUSE_ASYNC_READ: asynchronous read requests
+ * FUSE_POSIX_LOCKS: remote locking for POSIX file locks
+ * FUSE_FILE_OPS: kernel sends file handle for fstat, etc... (not yet supported)
+ * FUSE_ATOMIC_O_TRUNC: handles the O_TRUNC open flag in the filesystem
+ * FUSE_EXPORT_SUPPORT: filesystem handles lookups of "." and ".."
+ * FUSE_BIG_WRITES: filesystem can handle write size larger than 4kB
+ * FUSE_DONT_MASK: don't apply umask to file mode on create operations
+ * FUSE_SPLICE_WRITE: kernel supports splice write on the device
+ * FUSE_SPLICE_MOVE: kernel supports splice move on the device
+ * FUSE_SPLICE_READ: kernel supports splice read on the device
+ * FUSE_FLOCK_LOCKS: remote locking for BSD style file locks
+ * FUSE_HAS_IOCTL_DIR: kernel supports ioctl on directories
+ * FUSE_AUTO_INVAL_DATA: automatically invalidate cached pages
+ * FUSE_DO_READDIRPLUS: do READDIRPLUS (READDIR+LOOKUP in one)
+ * FUSE_READDIRPLUS_AUTO: adaptive readdirplus
+ * FUSE_ASYNC_DIO: asynchronous direct I/O submission
+ * FUSE_WRITEBACK_CACHE: use writeback cache for buffered writes
+ * FUSE_NO_OPEN_SUPPORT: kernel supports zero-message opens
+ * FUSE_PARALLEL_DIROPS: allow parallel lookups and readdir
+ * FUSE_HANDLE_KILLPRIV: fs handles killing suid/sgid/cap on write/chown/trunc
+ * FUSE_POSIX_ACL: filesystem supports posix acls
+ * FUSE_ABORT_ERROR: reading the device after abort returns ECONNABORTED
+ * FUSE_MAX_PAGES: init_out.max_pages contains the max number of req pages
+ * FUSE_CACHE_SYMLINKS: cache READLINK responses
+ * FUSE_NO_OPENDIR_SUPPORT: kernel supports zero-message opendir
+ * FUSE_EXPLICIT_INVAL_DATA: only invalidate cached pages on explicit request
+ * FUSE_MAP_ALIGNMENT: init_out.map_alignment contains log2(byte alignment) for
+ * foffset and moffset fields in struct
+ * fuse_setupmapping_out and fuse_removemapping_one.
+ * FUSE_SUBMOUNTS: kernel supports auto-mounting directory submounts
+ * FUSE_HANDLE_KILLPRIV_V2: fs kills suid/sgid/cap on write/chown/trunc.
+ * Upon write/truncate suid/sgid is only killed if caller
+ * does not have CAP_FSETID. Additionally upon
+ * write/truncate sgid is killed only if file has group
+ * execute permission. (Same as Linux VFS behavior).
+ * FUSE_SETXATTR_EXT: Server supports extended struct fuse_setxattr_in
+ * FUSE_INIT_EXT: extended fuse_init_in request
+ * FUSE_INIT_RESERVED: reserved, do not use
+ * FUSE_SECURITY_CTX: add security context to create, mkdir, symlink, and
+ * mknod
+ * FUSE_HAS_INODE_DAX: use per inode DAX
+ * FUSE_CREATE_SUPP_GROUP: add supplementary group info to create, mkdir,
+ * symlink and mknod (single group that matches parent)
+ */
+#define FUSE_ASYNC_READ (1 << 0)
+#define FUSE_POSIX_LOCKS (1 << 1)
+#define FUSE_FILE_OPS (1 << 2)
+#define FUSE_ATOMIC_O_TRUNC (1 << 3)
+#define FUSE_EXPORT_SUPPORT (1 << 4)
+#define FUSE_BIG_WRITES (1 << 5)
+#define FUSE_DONT_MASK (1 << 6)
+#define FUSE_SPLICE_WRITE (1 << 7)
+#define FUSE_SPLICE_MOVE (1 << 8)
+#define FUSE_SPLICE_READ (1 << 9)
+#define FUSE_FLOCK_LOCKS (1 << 10)
+#define FUSE_HAS_IOCTL_DIR (1 << 11)
+#define FUSE_AUTO_INVAL_DATA (1 << 12)
+#define FUSE_DO_READDIRPLUS (1 << 13)
+#define FUSE_READDIRPLUS_AUTO (1 << 14)
+#define FUSE_ASYNC_DIO (1 << 15)
+#define FUSE_WRITEBACK_CACHE (1 << 16)
+#define FUSE_NO_OPEN_SUPPORT (1 << 17)
+#define FUSE_PARALLEL_DIROPS (1 << 18)
+#define FUSE_HANDLE_KILLPRIV (1 << 19)
+#define FUSE_POSIX_ACL (1 << 20)
+#define FUSE_ABORT_ERROR (1 << 21)
+#define FUSE_MAX_PAGES (1 << 22)
+#define FUSE_CACHE_SYMLINKS (1 << 23)
+#define FUSE_NO_OPENDIR_SUPPORT (1 << 24)
+#define FUSE_EXPLICIT_INVAL_DATA (1 << 25)
+#define FUSE_MAP_ALIGNMENT (1 << 26)
+#define FUSE_SUBMOUNTS (1 << 27)
+#define FUSE_HANDLE_KILLPRIV_V2 (1 << 28)
+#define FUSE_SETXATTR_EXT (1 << 29)
+#define FUSE_INIT_EXT (1 << 30)
+#define FUSE_INIT_RESERVED (1 << 31)
+/* bits 32..63 get shifted down 32 bits into the flags2 field */
+#define FUSE_SECURITY_CTX (1ULL << 32)
+#define FUSE_HAS_INODE_DAX (1ULL << 33)
+#define FUSE_CREATE_SUPP_GROUP (1ULL << 34)
+
+/**
+ * CUSE INIT request/reply flags
+ *
+ * CUSE_UNRESTRICTED_IOCTL: use unrestricted ioctl
+ */
+#define CUSE_UNRESTRICTED_IOCTL (1 << 0)
+
+/**
+ * Release flags
+ */
+#define FUSE_RELEASE_FLUSH (1 << 0)
+#define FUSE_RELEASE_FLOCK_UNLOCK (1 << 1)
+
+/**
+ * Getattr flags
+ */
+#define FUSE_GETATTR_FH (1 << 0)
+
+/**
+ * Lock flags
+ */
+#define FUSE_LK_FLOCK (1 << 0)
+
+/**
+ * WRITE flags
+ *
+ * FUSE_WRITE_CACHE: delayed write from page cache, file handle is guessed
+ * FUSE_WRITE_LOCKOWNER: lock_owner field is valid
+ * FUSE_WRITE_KILL_SUIDGID: kill suid and sgid bits
+ */
+#define FUSE_WRITE_CACHE (1 << 0)
+#define FUSE_WRITE_LOCKOWNER (1 << 1)
+#define FUSE_WRITE_KILL_SUIDGID (1 << 2)
+
+/* Obsolete alias; this flag implies killing suid/sgid only. */
+#define FUSE_WRITE_KILL_PRIV FUSE_WRITE_KILL_SUIDGID
+
+/**
+ * Read flags
+ */
+#define FUSE_READ_LOCKOWNER (1 << 1)
+
+/**
+ * Ioctl flags
+ *
+ * FUSE_IOCTL_COMPAT: 32bit compat ioctl on 64bit machine
+ * FUSE_IOCTL_UNRESTRICTED: not restricted to well-formed ioctls, retry allowed
+ * FUSE_IOCTL_RETRY: retry with new iovecs
+ * FUSE_IOCTL_32BIT: 32bit ioctl
+ * FUSE_IOCTL_DIR: is a directory
+ * FUSE_IOCTL_COMPAT_X32: x32 compat ioctl on 64bit machine (64bit time_t)
+ *
+ * FUSE_IOCTL_MAX_IOV: maximum of in_iovecs + out_iovecs
+ */
+#define FUSE_IOCTL_COMPAT (1 << 0)
+#define FUSE_IOCTL_UNRESTRICTED (1 << 1)
+#define FUSE_IOCTL_RETRY (1 << 2)
+#define FUSE_IOCTL_32BIT (1 << 3)
+#define FUSE_IOCTL_DIR (1 << 4)
+#define FUSE_IOCTL_COMPAT_X32 (1 << 5)
+
+#define FUSE_IOCTL_MAX_IOV 256
+
+/**
+ * Poll flags
+ *
+ * FUSE_POLL_SCHEDULE_NOTIFY: request poll notify
+ */
+#define FUSE_POLL_SCHEDULE_NOTIFY (1 << 0)
+
+/**
+ * Fsync flags
+ *
+ * FUSE_FSYNC_FDATASYNC: Sync data only, not metadata
+ */
+#define FUSE_FSYNC_FDATASYNC (1 << 0)
+
+/**
+ * fuse_attr flags
+ *
+ * FUSE_ATTR_SUBMOUNT: Object is a submount root
+ * FUSE_ATTR_DAX: Enable DAX for this file in per inode DAX mode
+ */
+#define FUSE_ATTR_SUBMOUNT (1 << 0)
+#define FUSE_ATTR_DAX (1 << 1)
+
+/**
+ * Open flags
+ * FUSE_OPEN_KILL_SUIDGID: Kill suid and sgid if executable
+ */
+#define FUSE_OPEN_KILL_SUIDGID (1 << 0)
+
+/**
+ * setxattr flags
+ * FUSE_SETXATTR_ACL_KILL_SGID: Clear SGID when system.posix_acl_access is set
+ */
+#define FUSE_SETXATTR_ACL_KILL_SGID (1 << 0)
+
+/**
+ * notify_inval_entry flags
+ * FUSE_EXPIRE_ONLY
+ */
+#define FUSE_EXPIRE_ONLY (1 << 0)
+
+/**
+ * extension type
+ * FUSE_MAX_NR_SECCTX: maximum value of &fuse_secctx_header.nr_secctx
+ * FUSE_EXT_GROUPS: &fuse_supp_groups extension
+ */
+enum fuse_ext_type {
+ /* Types 0..31 are reserved for fuse_secctx_header */
+ FUSE_MAX_NR_SECCTX = 31,
+ FUSE_EXT_GROUPS = 32,
+ FUSE_ERROR_IN = 33,
+};
+
+enum fuse_opcode {
+ FUSE_LOOKUP = 1,
+ FUSE_FORGET = 2, /* no reply */
+ FUSE_GETATTR = 3,
+ FUSE_SETATTR = 4,
+ FUSE_READLINK = 5,
+ FUSE_SYMLINK = 6,
+ FUSE_MKNOD = 8,
+ FUSE_MKDIR = 9,
+ FUSE_UNLINK = 10,
+ FUSE_RMDIR = 11,
+ FUSE_RENAME = 12,
+ FUSE_LINK = 13,
+ FUSE_OPEN = 14,
+ FUSE_READ = 15,
+ FUSE_WRITE = 16,
+ FUSE_STATFS = 17,
+ FUSE_RELEASE = 18,
+ FUSE_FSYNC = 20,
+ FUSE_SETXATTR = 21,
+ FUSE_GETXATTR = 22,
+ FUSE_LISTXATTR = 23,
+ FUSE_REMOVEXATTR = 24,
+ FUSE_FLUSH = 25,
+ FUSE_INIT = 26,
+ FUSE_OPENDIR = 27,
+ FUSE_READDIR = 28,
+ FUSE_RELEASEDIR = 29,
+ FUSE_FSYNCDIR = 30,
+ FUSE_GETLK = 31,
+ FUSE_SETLK = 32,
+ FUSE_SETLKW = 33,
+ FUSE_ACCESS = 34,
+ FUSE_CREATE = 35,
+ FUSE_INTERRUPT = 36,
+ FUSE_BMAP = 37,
+ FUSE_DESTROY = 38,
+ FUSE_IOCTL = 39,
+ FUSE_POLL = 40,
+ FUSE_NOTIFY_REPLY = 41,
+ FUSE_BATCH_FORGET = 42,
+ FUSE_FALLOCATE = 43,
+ FUSE_READDIRPLUS = 44,
+ FUSE_RENAME2 = 45,
+ FUSE_LSEEK = 46,
+ FUSE_COPY_FILE_RANGE = 47,
+ FUSE_SETUPMAPPING = 48,
+ FUSE_REMOVEMAPPING = 49,
+ FUSE_SYNCFS = 50,
+ FUSE_TMPFILE = 51,
+
+ /* CUSE specific operations */
+ CUSE_INIT = 4096,
+
+ /* Reserved opcodes: helpful to detect structure endian-ness */
+ CUSE_INIT_BSWAP_RESERVED = 1048576, /* CUSE_INIT << 8 */
+ FUSE_INIT_BSWAP_RESERVED = 436207616, /* FUSE_INIT << 24 */
+};
+
+enum fuse_notify_code {
+ FUSE_NOTIFY_POLL = 1,
+ FUSE_NOTIFY_INVAL_INODE = 2,
+ FUSE_NOTIFY_INVAL_ENTRY = 3,
+ FUSE_NOTIFY_STORE = 4,
+ FUSE_NOTIFY_RETRIEVE = 5,
+ FUSE_NOTIFY_DELETE = 6,
+ FUSE_NOTIFY_CODE_MAX,
+};
+
+/* The read buffer is required to be at least 8k, but may be much larger */
+#define FUSE_MIN_READ_BUFFER 8192
+
+#define FUSE_COMPAT_ENTRY_OUT_SIZE 120
+
+struct fuse_entry_out {
+ uint64_t nodeid; /* Inode ID */
+ uint64_t generation; /* Inode generation: nodeid:gen must
+ be unique for the fs's lifetime */
+ uint64_t entry_valid; /* Cache timeout for the name */
+ uint64_t attr_valid; /* Cache timeout for the attributes */
+ uint32_t entry_valid_nsec;
+ uint32_t attr_valid_nsec;
+ struct fuse_attr attr;
+};
+
+#define FUSE_BPF_MAX_ENTRIES 2
+
+enum fuse_bpf_type {
+ FUSE_ENTRY_BACKING = 1,
+ FUSE_ENTRY_BPF = 2,
+ FUSE_ENTRY_REMOVE_BACKING = 3,
+ FUSE_ENTRY_REMOVE_BPF = 4,
+};
+
+#define BPF_FUSE_NAME_MAX 15
+
+struct fuse_bpf_entry_out {
+ uint32_t entry_type;
+ uint32_t unused;
+ union {
+ struct {
+ uint64_t unused2;
+ uint64_t fd;
+ };
+ char name[BPF_FUSE_NAME_MAX + 1];
+ };
+};
+
+struct fuse_forget_in {
+ uint64_t nlookup;
+};
+
+struct fuse_forget_one {
+ uint64_t nodeid;
+ uint64_t nlookup;
+};
+
+struct fuse_batch_forget_in {
+ uint32_t count;
+ uint32_t dummy;
+};
+
+struct fuse_getattr_in {
+ uint32_t getattr_flags;
+ uint32_t dummy;
+ uint64_t fh;
+};
+
+#define FUSE_COMPAT_ATTR_OUT_SIZE 96
+
+struct fuse_attr_out {
+ uint64_t attr_valid; /* Cache timeout for the attributes */
+ uint32_t attr_valid_nsec;
+ uint32_t dummy;
+ struct fuse_attr attr;
+};
+
+#define FUSE_COMPAT_MKNOD_IN_SIZE 8
+
+struct fuse_mknod_in {
+ uint32_t mode;
+ uint32_t rdev;
+ uint32_t umask;
+ uint32_t padding;
+};
+
+struct fuse_mkdir_in {
+ uint32_t mode;
+ uint32_t umask;
+};
+
+struct fuse_rename_in {
+ uint64_t newdir;
+};
+
+struct fuse_rename2_in {
+ uint64_t newdir;
+ uint32_t flags;
+ uint32_t padding;
+};
+
+struct fuse_link_in {
+ uint64_t oldnodeid;
+};
+
+struct fuse_setattr_in {
+ uint32_t valid;
+ uint32_t padding;
+ uint64_t fh;
+ uint64_t size;
+ uint64_t lock_owner;
+ uint64_t atime;
+ uint64_t mtime;
+ uint64_t ctime;
+ uint32_t atimensec;
+ uint32_t mtimensec;
+ uint32_t ctimensec;
+ uint32_t mode;
+ uint32_t unused4;
+ uint32_t uid;
+ uint32_t gid;
+ uint32_t unused5;
+};
+
+struct fuse_open_in {
+ uint32_t flags;
+ uint32_t open_flags; /* FUSE_OPEN_... */
+};
+
+struct fuse_create_in {
+ uint32_t flags;
+ uint32_t mode;
+ uint32_t umask;
+ uint32_t open_flags; /* FUSE_OPEN_... */
+};
+
+struct fuse_open_out {
+ uint64_t fh;
+ uint32_t open_flags;
+ uint32_t padding;
+};
+
+struct fuse_release_in {
+ uint64_t fh;
+ uint32_t flags;
+ uint32_t release_flags;
+ uint64_t lock_owner;
+};
+
+struct fuse_flush_in {
+ uint64_t fh;
+ uint32_t unused;
+ uint32_t padding;
+ uint64_t lock_owner;
+};
+
+struct fuse_read_in {
+ uint64_t fh;
+ uint64_t offset;
+ uint32_t size;
+ uint32_t read_flags;
+ uint64_t lock_owner;
+ uint32_t flags;
+ uint32_t padding;
+};
+
+struct fuse_read_out {
+ uint64_t offset;
+ uint32_t again;
+ uint32_t padding;
+};
+
+// This is likely not what we want
+struct fuse_read_iter_out {
+ uint64_t ret;
+};
+
+#define FUSE_COMPAT_WRITE_IN_SIZE 24
+
+struct fuse_write_in {
+ uint64_t fh;
+ uint64_t offset;
+ uint32_t size;
+ uint32_t write_flags;
+ uint64_t lock_owner;
+ uint32_t flags;
+ uint32_t padding;
+};
+
+struct fuse_write_out {
+ uint32_t size;
+ uint32_t padding;
+};
+
+// This is likely not what we want
+struct fuse_write_iter_out {
+ uint64_t ret;
+};
+
+#define FUSE_COMPAT_STATFS_SIZE 48
+
+struct fuse_statfs_out {
+ struct fuse_kstatfs st;
+};
+
+struct fuse_fsync_in {
+ uint64_t fh;
+ uint32_t fsync_flags;
+ uint32_t padding;
+};
+
+#define FUSE_COMPAT_SETXATTR_IN_SIZE 8
+
+struct fuse_setxattr_in {
+ uint32_t size;
+ uint32_t flags;
+ uint32_t setxattr_flags;
+ uint32_t padding;
+};
+
+struct fuse_getxattr_in {
+ uint32_t size;
+ uint32_t padding;
+};
+
+struct fuse_getxattr_out {
+ uint32_t size;
+ uint32_t padding;
+};
+
+struct fuse_lk_in {
+ uint64_t fh;
+ uint64_t owner;
+ struct fuse_file_lock lk;
+ uint32_t lk_flags;
+ uint32_t padding;
+};
+
+struct fuse_lk_out {
+ struct fuse_file_lock lk;
+};
+
+struct fuse_access_in {
+ uint32_t mask;
+ uint32_t padding;
+};
+
+struct fuse_init_in {
+ uint32_t major;
+ uint32_t minor;
+ uint32_t max_readahead;
+ uint32_t flags;
+ uint32_t flags2;
+ uint32_t unused[11];
+};
+
+#define FUSE_COMPAT_INIT_OUT_SIZE 8
+#define FUSE_COMPAT_22_INIT_OUT_SIZE 24
+
+struct fuse_init_out {
+ uint32_t major;
+ uint32_t minor;
+ uint32_t max_readahead;
+ uint32_t flags;
+ uint16_t max_background;
+ uint16_t congestion_threshold;
+ uint32_t max_write;
+ uint32_t time_gran;
+ uint16_t max_pages;
+ uint16_t map_alignment;
+ uint32_t flags2;
+ uint32_t unused[7];
+};
+
+#define CUSE_INIT_INFO_MAX 4096
+
+struct cuse_init_in {
+ uint32_t major;
+ uint32_t minor;
+ uint32_t unused;
+ uint32_t flags;
+};
+
+struct cuse_init_out {
+ uint32_t major;
+ uint32_t minor;
+ uint32_t unused;
+ uint32_t flags;
+ uint32_t max_read;
+ uint32_t max_write;
+ uint32_t dev_major; /* chardev major */
+ uint32_t dev_minor; /* chardev minor */
+ uint32_t spare[10];
+};
+
+struct fuse_interrupt_in {
+ uint64_t unique;
+};
+
+struct fuse_bmap_in {
+ uint64_t block;
+ uint32_t blocksize;
+ uint32_t padding;
+};
+
+struct fuse_bmap_out {
+ uint64_t block;
+};
+
+struct fuse_ioctl_in {
+ uint64_t fh;
+ uint32_t flags;
+ uint32_t cmd;
+ uint64_t arg;
+ uint32_t in_size;
+ uint32_t out_size;
+};
+
+struct fuse_ioctl_iovec {
+ uint64_t base;
+ uint64_t len;
+};
+
+struct fuse_ioctl_out {
+ int32_t result;
+ uint32_t flags;
+ uint32_t in_iovs;
+ uint32_t out_iovs;
+};
+
+struct fuse_poll_in {
+ uint64_t fh;
+ uint64_t kh;
+ uint32_t flags;
+ uint32_t events;
+};
+
+struct fuse_poll_out {
+ uint32_t revents;
+ uint32_t padding;
+};
+
+struct fuse_notify_poll_wakeup_out {
+ uint64_t kh;
+};
+
+struct fuse_fallocate_in {
+ uint64_t fh;
+ uint64_t offset;
+ uint64_t length;
+ uint32_t mode;
+ uint32_t padding;
+};
+
+struct fuse_in_header {
+ uint32_t len;
+ uint32_t opcode;
+ uint64_t unique;
+ uint64_t nodeid;
+ uint32_t uid;
+ uint32_t gid;
+ uint32_t pid;
+ uint16_t total_extlen; /* length of extensions in 8byte units */
+ uint16_t padding;
+ //uint32_t error_in; uh oh
+};
+
+struct fuse_out_header {
+ uint32_t len;
+ int32_t error;
+ uint64_t unique;
+};
+
+struct fuse_dirent {
+ uint64_t ino;
+ uint64_t off;
+ uint32_t namelen;
+ uint32_t type;
+ char name[];
+};
+
+/* Align variable length records to 64bit boundary */
+#define FUSE_REC_ALIGN(x) \
+ (((x) + sizeof(uint64_t) - 1) & ~(sizeof(uint64_t) - 1))
+
+#define FUSE_NAME_OFFSET offsetof(struct fuse_dirent, name)
+#define FUSE_DIRENT_ALIGN(x) FUSE_REC_ALIGN(x)
+#define FUSE_DIRENT_SIZE(d) \
+ FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + (d)->namelen)
+
+struct fuse_direntplus {
+ struct fuse_entry_out entry_out;
+ struct fuse_dirent dirent;
+};
+
+#define FUSE_NAME_OFFSET_DIRENTPLUS \
+ offsetof(struct fuse_direntplus, dirent.name)
+#define FUSE_DIRENTPLUS_SIZE(d) \
+ FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET_DIRENTPLUS + (d)->dirent.namelen)
+
+struct fuse_notify_inval_inode_out {
+ uint64_t ino;
+ int64_t off;
+ int64_t len;
+};
+
+struct fuse_notify_inval_entry_out {
+ uint64_t parent;
+ uint32_t namelen;
+ uint32_t flags;
+};
+
+struct fuse_notify_delete_out {
+ uint64_t parent;
+ uint64_t child;
+ uint32_t namelen;
+ uint32_t padding;
+};
+
+struct fuse_notify_store_out {
+ uint64_t nodeid;
+ uint64_t offset;
+ uint32_t size;
+ uint32_t padding;
+};
+
+struct fuse_notify_retrieve_out {
+ uint64_t notify_unique;
+ uint64_t nodeid;
+ uint64_t offset;
+ uint32_t size;
+ uint32_t padding;
+};
+
+/* Matches the size of fuse_write_in */
+struct fuse_notify_retrieve_in {
+ uint64_t dummy1;
+ uint64_t offset;
+ uint32_t size;
+ uint32_t dummy2;
+ uint64_t dummy3;
+ uint64_t dummy4;
+};
+
+/* Device ioctls: */
+#define FUSE_DEV_IOC_MAGIC 229
+#define FUSE_DEV_IOC_CLONE _IOR(FUSE_DEV_IOC_MAGIC, 0, uint32_t)
+#define FUSE_DEV_IOC_BPF_RESPONSE(N) _IOW(FUSE_DEV_IOC_MAGIC, 125, char[N])
+
+struct fuse_lseek_in {
+ uint64_t fh;
+ uint64_t offset;
+ uint32_t whence;
+ uint32_t padding;
+};
+
+struct fuse_lseek_out {
+ uint64_t offset;
+};
+
+struct fuse_copy_file_range_in {
+ uint64_t fh_in;
+ uint64_t off_in;
+ uint64_t nodeid_out;
+ uint64_t fh_out;
+ uint64_t off_out;
+ uint64_t len;
+ uint64_t flags;
+};
+
+#define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
+#define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
+struct fuse_setupmapping_in {
+ /* An already open handle */
+ uint64_t fh;
+ /* Offset into the file to start the mapping */
+ uint64_t foffset;
+ /* Length of mapping required */
+ uint64_t len;
+ /* Flags, FUSE_SETUPMAPPING_FLAG_* */
+ uint64_t flags;
+ /* Offset in Memory Window */
+ uint64_t moffset;
+};
+
+struct fuse_removemapping_in {
+ /* number of fuse_removemapping_one follows */
+ uint32_t count;
+};
+
+struct fuse_removemapping_one {
+ /* Offset into the dax window start the unmapping */
+ uint64_t moffset;
+ /* Length of mapping required */
+ uint64_t len;
+};
+
+#define FUSE_REMOVEMAPPING_MAX_ENTRY \
+ (PAGE_SIZE / sizeof(struct fuse_removemapping_one))
+
+struct fuse_syncfs_in {
+ uint64_t padding;
+};
+
+/*
+ * For each security context, send fuse_secctx with size of security context
+ * fuse_secctx will be followed by security context name and this in turn
+ * will be followed by actual context label.
+ * fuse_secctx, name, context
+ */
+struct fuse_secctx {
+ uint32_t size;
+ uint32_t padding;
+};
+
+/*
+ * Contains the information about how many fuse_secctx structures are being
+ * sent and what's the total size of all security contexts (including
+ * size of fuse_secctx_header).
+ *
+ */
+struct fuse_secctx_header {
+ uint32_t size;
+ uint32_t nr_secctx;
+};
+
+/**
+ * struct fuse_ext_header - extension header
+ * @size: total size of this extension including this header
+ * @type: type of extension
+ *
+ * This is made compatible with fuse_secctx_header by using type values >
+ * FUSE_MAX_NR_SECCTX
+ */
+struct fuse_ext_header {
+ uint32_t size;
+ uint32_t type;
+};
+
+/**
+ * struct fuse_supp_groups - Supplementary group extension
+ * @nr_groups: number of supplementary groups
+ * @groups: flexible array of group IDs
+ */
+struct fuse_supp_groups {
+ uint32_t nr_groups;
+ uint32_t groups[];
+};
+
+#endif /* _LINUX_FUSE_H */
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:45:44

by Daniel Rosenberg

Subject: [RFC PATCH v3 26/37] bpf: Increase struct_op limits

Fuse bpf goes a bit past the limit of 64 members here, although in practice
the effective limit seems to be closer to 37. Past 37 members, we start
tripping the safety checks while setting up the trampolines.

This simply doubles some of these values. The same issue remains, as we'll
run out of space in the trampoline image well before hitting the new limit
of 128, but for now this unblocks fuse-bpf.
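
A back-of-the-envelope sketch of that budget, under two assumptions that are
not stated in the patch: a 4096-byte PAGE_SIZE, and trampolines of roughly
uniform size. The helper names are hypothetical. If about 37 members fit in
one page, each takes about 110 bytes, so two pages would hold about 74
members, still well short of 128.

```c
#include <assert.h>

/* Assumption: 4 KiB pages. PAGE_SIZE differs on some architectures. */
#define ASSUMED_PAGE_SIZE     4096u
/* Observed practical member limit quoted in the commit message. */
#define OBSERVED_MEMBER_LIMIT 37u

/* Hypothetical helper: average image bytes consumed per member. */
static inline unsigned int est_bytes_per_member(unsigned int image_bytes,
						unsigned int members)
{
	assert(members > 0);
	return image_bytes / members;
}

/* Hypothetical helper: how many members fit in an image of this size. */
static inline unsigned int est_member_capacity(unsigned int image_bytes,
					       unsigned int bytes_per_member)
{
	assert(bytes_per_member > 0);
	return image_bytes / bytes_per_member;
}
```

This only estimates an average; real trampoline sizes depend on the
architecture and on each member's argument count.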

Signed-off-by: Daniel Rosenberg <[email protected]>
---
include/linux/bpf.h | 2 +-
kernel/bpf/bpf_struct_ops.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 18b592fde896..c006f823e634 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1537,7 +1537,7 @@ struct bpf_link_primer {
struct bpf_struct_ops_value;
struct btf_member;

-#define BPF_STRUCT_OPS_MAX_NR_MEMBERS 64
+#define BPF_STRUCT_OPS_MAX_NR_MEMBERS 128
struct bpf_struct_ops {
const struct bpf_verifier_ops *verifier_ops;
int (*init)(struct btf *btf);
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index d3f0a4825fa6..deb9eecaf1e4 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -417,7 +417,7 @@ static long bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
udata = &uvalue->data;
kdata = &kvalue->data;
image = st_map->image;
- image_end = st_map->image + PAGE_SIZE;
+ image_end = st_map->image + 2 * PAGE_SIZE;

for_each_member(i, t, member) {
const struct btf_type *mtype, *ptype;
@@ -688,7 +688,7 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
st_map->links =
bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_links *),
NUMA_NO_NODE);
- st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
+ st_map->image = bpf_jit_alloc_exec(2 * PAGE_SIZE);
if (!st_map->uvalue || !st_map->links || !st_map->image) {
__bpf_struct_ops_map_free(map);
return ERR_PTR(-ENOMEM);
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:45:47

by Daniel Rosenberg

Subject: [RFC PATCH v3 28/37] WIP: bpf: Add fuse_ops struct_op programs

This introduces a new struct_op type: fuse_ops. This program set
provides pre and post filters to run around fuse-bpf calls that act
directly on the lower filesystem.

The inputs are either fixed structures or struct fuse_buffers.

These programs are not permitted to make any changes to these fuse_buffers
unless they create a dynptr wrapper using the supplied kfunc helpers.

Fuse_buffers maintain additional state information that FUSE uses to
manage memory and determine whether additional setup or checks are needed.
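
To make the pre/post filter shape concrete, here is a minimal userspace mock,
not kernel or BPF code. The filter signatures and the fuse_open_in and
fuse_open_out layouts follow the headers in this series; struct
bpf_fuse_meta_info is a stand-in with an invented field, and mock_fuse_ops,
mock_open, DEMO_IN_FLAG and the demo filters are hypothetical. The mock
always runs the postfilter after a prefilter that returns BPF_FUSE_CONTINUE,
which is a simplification and may not match how fuse handles each return
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BPF_FUSE_CONTINUE 0        /* from the fuse-bpf constants patch */
#define FOPEN_KEEP_CACHE  (1 << 1) /* from fuse.h */
#define DEMO_IN_FLAG      0x1u     /* hypothetical flag bit for the demo */

/* Stand-in: the real layout is not shown in this part of the series. */
struct bpf_fuse_meta_info { uint64_t nodeid; };

struct fuse_open_in  { uint32_t flags; uint32_t open_flags; };
struct fuse_open_out { uint64_t fh; uint32_t open_flags; uint32_t padding; };

/* Mock of the open-related members of struct fuse_ops. */
struct mock_fuse_ops {
	uint32_t (*open_prefilter)(const struct bpf_fuse_meta_info *meta,
				   struct fuse_open_in *in);
	uint32_t (*open_postfilter)(const struct bpf_fuse_meta_info *meta,
				    const struct fuse_open_in *in,
				    struct fuse_open_out *out);
};

/* Demo prefilter: may edit the input before the backing call. */
static uint32_t demo_open_prefilter(const struct bpf_fuse_meta_info *meta,
				    struct fuse_open_in *in)
{
	(void)meta;
	in->flags &= ~DEMO_IN_FLAG;
	return BPF_FUSE_CONTINUE;
}

/* Demo postfilter: input is read-only, output may be adjusted. */
static uint32_t demo_open_postfilter(const struct bpf_fuse_meta_info *meta,
				     const struct fuse_open_in *in,
				     struct fuse_open_out *out)
{
	(void)meta;
	(void)in;
	out->open_flags |= FOPEN_KEEP_CACHE;
	return BPF_FUSE_CONTINUE;
}

/* Mock driver: prefilter, fake backing open, then postfilter. */
static uint32_t mock_open(const struct mock_fuse_ops *ops,
			  const struct bpf_fuse_meta_info *meta,
			  struct fuse_open_in *in, struct fuse_open_out *out)
{
	uint32_t ret = BPF_FUSE_CONTINUE;

	if (ops->open_prefilter)
		ret = ops->open_prefilter(meta, in);
	if (ret != BPF_FUSE_CONTINUE)
		return ret; /* mock: leave other codes to the caller */

	/* Stand-in for the open on the lower filesystem. */
	out->fh = 42;
	out->open_flags = 0;
	out->padding = 0;

	if (ops->open_postfilter)
		ret = ops->open_postfilter(meta, in, out);
	return ret;
}
```

Calling mock_open() with both demo filters installed clears DEMO_IN_FLAG on
the way in and sets FOPEN_KEEP_CACHE on the way out.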

Signed-off-by: Daniel Rosenberg <[email protected]>
---
include/linux/bpf_fuse.h | 189 +++++++++++++++++++++++
kernel/bpf/Makefile | 4 +
kernel/bpf/bpf_fuse.c | 241 ++++++++++++++++++++++++++++++
kernel/bpf/bpf_struct_ops_types.h | 4 +
kernel/bpf/btf.c | 1 +
kernel/bpf/verifier.c | 9 ++
6 files changed, 448 insertions(+)
create mode 100644 kernel/bpf/bpf_fuse.c

diff --git a/include/linux/bpf_fuse.h b/include/linux/bpf_fuse.h
index ce8b1b347496..780a7889aea2 100644
--- a/include/linux/bpf_fuse.h
+++ b/include/linux/bpf_fuse.h
@@ -30,6 +30,8 @@ struct fuse_buffer {
#define BPF_FUSE_MODIFIED (1 << 3) // The helper function allowed writes to the buffer
#define BPF_FUSE_ALLOCATED (1 << 4) // The helper function allocated the buffer

+extern void *bpf_fuse_get_writeable(struct fuse_buffer *arg, u64 size, bool copy);
+
/*
* BPF Fuse Args
*
@@ -81,4 +83,191 @@ static inline unsigned bpf_fuse_arg_size(const struct bpf_fuse_arg *arg)
return arg->is_buffer ? arg->buffer->size : arg->size;
}

+struct fuse_ops {
+ uint32_t (*open_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in);
+ uint32_t (*open_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_open_in *in,
+ struct fuse_open_out *out);
+
+ uint32_t (*opendir_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in);
+ uint32_t (*opendir_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_open_in *in,
+ struct fuse_open_out *out);
+
+ uint32_t (*create_open_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_create_in *in, struct fuse_buffer *name);
+ uint32_t (*create_open_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_create_in *in, const struct fuse_buffer *name,
+ struct fuse_entry_out *entry_out, struct fuse_open_out *out);
+
+ uint32_t (*release_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in);
+ uint32_t (*release_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_release_in *in);
+
+ uint32_t (*releasedir_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in);
+ uint32_t (*releasedir_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_release_in *in);
+
+ uint32_t (*flush_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_flush_in *in);
+ uint32_t (*flush_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_flush_in *in);
+
+ uint32_t (*lseek_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_lseek_in *in);
+ uint32_t (*lseek_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_lseek_in *in,
+ struct fuse_lseek_out *out);
+
+ uint32_t (*copy_file_range_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_copy_file_range_in *in);
+ uint32_t (*copy_file_range_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_copy_file_range_in *in,
+ struct fuse_write_out *out);
+
+ uint32_t (*fsync_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in);
+ uint32_t (*fsync_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_fsync_in *in);
+
+ uint32_t (*dir_fsync_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in);
+ uint32_t (*dir_fsync_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_fsync_in *in);
+
+ uint32_t (*getxattr_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in, struct fuse_buffer *name);
+ // if in->size > 0, use value. If in->size == 0, use out.
+ uint32_t (*getxattr_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_getxattr_in *in, const struct fuse_buffer *name,
+ struct fuse_buffer *value, struct fuse_getxattr_out *out);
+
+ uint32_t (*listxattr_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in);
+ // if in->size > 0, use value. If in->size == 0, use out.
+ uint32_t (*listxattr_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_getxattr_in *in,
+ struct fuse_buffer *value, struct fuse_getxattr_out *out);
+
+ uint32_t (*setxattr_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_setxattr_in *in, struct fuse_buffer *name,
+ struct fuse_buffer *value);
+ uint32_t (*setxattr_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_setxattr_in *in, const struct fuse_buffer *name,
+ const struct fuse_buffer *value);
+
+ uint32_t (*removexattr_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name);
+ uint32_t (*removexattr_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name);
+
+ /* Read and Write iter will likely undergo some sort of change/addition to handle changing
+ * the data buffer passed in/out. */
+ uint32_t (*read_iter_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in);
+ uint32_t (*read_iter_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_read_in *in,
+ struct fuse_read_iter_out *out);
+
+ uint32_t (*write_iter_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_write_in *in);
+ uint32_t (*write_iter_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_write_in *in,
+ struct fuse_write_iter_out *out);
+
+ uint32_t (*file_fallocate_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_fallocate_in *in);
+ uint32_t (*file_fallocate_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_fallocate_in *in);
+
+ uint32_t (*lookup_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name);
+ uint32_t (*lookup_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name,
+ struct fuse_entry_out *out, struct fuse_buffer *entries);
+
+ uint32_t (*mknod_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_mknod_in *in, struct fuse_buffer *name);
+ uint32_t (*mknod_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_mknod_in *in, const struct fuse_buffer *name);
+
+ uint32_t (*mkdir_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_in *in, struct fuse_buffer *name);
+ uint32_t (*mkdir_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_mkdir_in *in, const struct fuse_buffer *name);
+
+ uint32_t (*rmdir_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name);
+ uint32_t (*rmdir_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name);
+
+ uint32_t (*rename2_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename2_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name);
+ uint32_t (*rename2_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_rename2_in *in, const struct fuse_buffer *old_name,
+ const struct fuse_buffer *new_name);
+
+ uint32_t (*rename_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name);
+ uint32_t (*rename_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_rename_in *in, const struct fuse_buffer *old_name,
+ const struct fuse_buffer *new_name);
+
+ uint32_t (*unlink_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name);
+ uint32_t (*unlink_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name);
+
+ uint32_t (*link_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_link_in *in, struct fuse_buffer *name);
+ uint32_t (*link_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_link_in *in, const struct fuse_buffer *name);
+
+ uint32_t (*getattr_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_getattr_in *in);
+ uint32_t (*getattr_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_getattr_in *in,
+ struct fuse_attr_out *out);
+
+ uint32_t (*setattr_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_setattr_in *in);
+ uint32_t (*setattr_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_setattr_in *in,
+ struct fuse_attr_out *out);
+
+ uint32_t (*statfs_prefilter)(const struct bpf_fuse_meta_info *meta);
+ uint32_t (*statfs_postfilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_statfs_out *out);
+
+ //TODO: This does not allow doing anything with path
+ uint32_t (*get_link_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name);
+ uint32_t (*get_link_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name);
+
+ uint32_t (*symlink_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name, struct fuse_buffer *path);
+ uint32_t (*symlink_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name, const struct fuse_buffer *path);
+
+ uint32_t (*readdir_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in);
+ uint32_t (*readdir_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_read_in *in,
+ struct fuse_read_out *out, struct fuse_buffer *buffer);
+
+ uint32_t (*access_prefilter)(const struct bpf_fuse_meta_info *meta,
+ struct fuse_access_in *in);
+ uint32_t (*access_postfilter)(const struct bpf_fuse_meta_info *meta,
+ const struct fuse_access_in *in);
+
+ char name[BPF_FUSE_NAME_MAX];
+};
+
#endif /* _BPF_FUSE_H */
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 1d3892168d32..26a2e741ef61 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -45,3 +45,7 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/
obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
$(obj)/relo_core.o: $(srctree)/tools/lib/bpf/relo_core.c FORCE
$(call if_changed_rule,cc_o_c)
+
+ifeq ($(CONFIG_FUSE_BPF),y)
+obj-$(CONFIG_BPF_SYSCALL) += bpf_fuse.o
+endif
diff --git a/kernel/bpf/bpf_fuse.c b/kernel/bpf/bpf_fuse.c
new file mode 100644
index 000000000000..35125c1f8eef
--- /dev/null
+++ b/kernel/bpf/bpf_fuse.c
@@ -0,0 +1,241 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2021 Google LLC
+
+#include <linux/filter.h>
+#include <linux/bpf.h>
+#include <linux/bpf_fuse.h>
+#include <linux/bpf_verifier.h>
+#include <linux/btf.h>
+
+void *bpf_fuse_get_writeable(struct fuse_buffer *arg, u64 size, bool copy)
+{
+ void *writeable_val;
+
+ if (arg->flags & BPF_FUSE_IMMUTABLE)
+ return 0;
+
+ if (size <= arg->size &&
+ (!(arg->flags & BPF_FUSE_MUST_ALLOCATE) ||
+ (arg->flags & BPF_FUSE_ALLOCATED))) {
+ if (arg->flags & BPF_FUSE_VARIABLE_SIZE)
+ arg->size = size;
+ arg->flags |= BPF_FUSE_MODIFIED;
+ return arg->data;
+ }
+ /* Variable sized arrays must stay below max size. If the buffer must be fixed size,
+ * don't change the allocated size. Verifier will enforce requested size for accesses
+ */
+ if (arg->flags & BPF_FUSE_VARIABLE_SIZE) {
+ if (size > arg->max_size)
+ return 0;
+ } else {
+ if (size > arg->size)
+ return 0;
+ size = arg->size;
+ }
+
+ if (size != arg->size && size > arg->max_size)
+ return 0;
+
+ /* If our buffer is big enough, just adjust size */
+ if (size <= arg->alloc_size) {
+ if (!copy)
+ arg->size = size;
+ arg->flags |= BPF_FUSE_MODIFIED;
+ return arg->data;
+ }
+
+ writeable_val = kzalloc(size, GFP_KERNEL);
+ if (!writeable_val)
+ return 0;
+
+ arg->alloc_size = size;
+ /* If we're copying the buffer, assume the same amount is used. If that isn't the case,
+ * caller must change size. Otherwise, assume entirety of new buffer is used.
+ */
+ if (copy)
+ memcpy(writeable_val, arg->data, (arg->size > size) ? size : arg->size);
+ else
+ arg->size = size;
+
+ if (arg->flags & BPF_FUSE_ALLOCATED)
+ kfree(arg->data);
+ arg->data = writeable_val;
+
+ arg->flags |= BPF_FUSE_ALLOCATED | BPF_FUSE_MODIFIED;
+
+ return arg->data;
+}
+EXPORT_SYMBOL(bpf_fuse_get_writeable);
+
+__diag_push();
+__diag_ignore_all("-Wmissing-prototypes",
+ "Global kfuncs as their definitions will be in BTF");
+void bpf_fuse_get_rw_dynptr(struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit, u64 size, bool copy)
+{
+ buffer->data = bpf_fuse_get_writeable(buffer, size, copy);
+ bpf_dynptr_init(dynptr__uninit, buffer->data, BPF_DYNPTR_TYPE_LOCAL, 0, buffer->size);
+}
+
+void bpf_fuse_get_ro_dynptr(const struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit)
+{
+ bpf_dynptr_init(dynptr__uninit, buffer->data, BPF_DYNPTR_TYPE_LOCAL, 0, buffer->size);
+ bpf_dynptr_set_rdonly(dynptr__uninit);
+}
+
+uint32_t bpf_fuse_return_len(struct fuse_buffer *buffer)
+{
+ return buffer->size;
+}
+__diag_pop();
+BTF_SET8_START(fuse_kfunc_set)
+BTF_ID_FLAGS(func, bpf_fuse_get_rw_dynptr)
+BTF_ID_FLAGS(func, bpf_fuse_get_ro_dynptr)
+BTF_ID_FLAGS(func, bpf_fuse_return_len)
+BTF_SET8_END(fuse_kfunc_set)
+
+static const struct btf_kfunc_id_set bpf_fuse_kfunc_set = {
+ .owner = THIS_MODULE,
+ .set = &fuse_kfunc_set,
+};
+
+static int __init bpf_fuse_kfuncs_init(void)
+{
+ return register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
+ &bpf_fuse_kfunc_set);
+}
+
+late_initcall(bpf_fuse_kfuncs_init);
+
+static const struct bpf_func_proto *bpf_fuse_get_func_proto(enum bpf_func_id func_id,
+ const struct bpf_prog *prog)
+{
+ switch (func_id) {
+ default:
+ return bpf_base_func_proto(func_id);
+ }
+}
+
+static bool bpf_fuse_is_valid_access(int off, int size,
+ enum bpf_access_type type,
+ const struct bpf_prog *prog,
+ struct bpf_insn_access_aux *info)
+{
+ return bpf_tracing_btf_ctx_access(off, size, type, prog, info);
+}
+
+const struct btf_type *fuse_buffer_struct_type;
+
+static int bpf_fuse_btf_struct_access(struct bpf_verifier_log *log,
+ const struct bpf_reg_state *reg,
+ int off, int size)
+{
+ const struct btf_type *t;
+
+ t = btf_type_by_id(reg->btf, reg->btf_id);
+ if (t == fuse_buffer_struct_type) {
+ bpf_log(log,
+ "direct access to fuse_buffer is disallowed\n");
+ return -EACCES;
+ }
+
+ return 0;
+}
+
+static const struct bpf_verifier_ops bpf_fuse_verifier_ops = {
+ .get_func_proto = bpf_fuse_get_func_proto,
+ .is_valid_access = bpf_fuse_is_valid_access,
+ .btf_struct_access = bpf_fuse_btf_struct_access,
+};
+
+static int bpf_fuse_check_member(const struct btf_type *t,
+ const struct btf_member *member,
+ const struct bpf_prog *prog)
+{
+ //if (is_unsupported(__btf_member_bit_offset(t, member) / 8))
+ // return -ENOTSUPP;
+ return 0;
+}
+
+static int bpf_fuse_init_member(const struct btf_type *t,
+ const struct btf_member *member,
+ void *kdata, const void *udata)
+{
+ const struct fuse_ops *uf_ops;
+ struct fuse_ops *f_ops;
+ u32 moff;
+
+ uf_ops = (const struct fuse_ops *)udata;
+ f_ops = (struct fuse_ops *)kdata;
+
+ moff = __btf_member_bit_offset(t, member) / 8;
+ switch (moff) {
+ case offsetof(struct fuse_ops, name):
+ if (bpf_obj_name_cpy(f_ops->name, uf_ops->name,
+ sizeof(f_ops->name)) <= 0)
+ return -EINVAL;
+ //if (tcp_ca_find(utcp_ca->name))
+ // return -EEXIST;
+ return 1;
+ }
+
+ return 0;
+}
+
+static int bpf_fuse_init(struct btf *btf)
+{
+ s32 type_id;
+
+ type_id = btf_find_by_name_kind(btf, "fuse_buffer", BTF_KIND_STRUCT);
+ if (type_id < 0)
+ return -EINVAL;
+ fuse_buffer_struct_type = btf_type_by_id(btf, type_id);
+
+ return 0;
+}
+
+static struct bpf_fuse_ops_attach *fuse_reg = NULL;
+
+static int bpf_fuse_reg(void *kdata)
+{
+ if (fuse_reg)
+ return fuse_reg->fuse_register_bpf(kdata);
+ pr_warn("Cannot register fuse_ops, FUSE not found\n");
+ return -EOPNOTSUPP;
+}
+
+static void bpf_fuse_unreg(void *kdata)
+{
+ if (fuse_reg)
+ fuse_reg->fuse_unregister_bpf(kdata);
+}
+
+int register_fuse_bpf(struct bpf_fuse_ops_attach *reg_ops)
+{
+ fuse_reg = reg_ops;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(register_fuse_bpf);
+
+void unregister_fuse_bpf(struct bpf_fuse_ops_attach *reg_ops)
+{
+ if (reg_ops == fuse_reg)
+ fuse_reg = NULL;
+ else
+ pr_warn("Refusing to unregister unregistered FUSE\n");
+}
+EXPORT_SYMBOL_GPL(unregister_fuse_bpf);
+
+/* "extern" is to avoid sparse warning. It is only used in bpf_struct_ops.c. */
+extern struct bpf_struct_ops bpf_fuse_ops;
+
+struct bpf_struct_ops bpf_fuse_ops = {
+ .verifier_ops = &bpf_fuse_verifier_ops,
+ .reg = bpf_fuse_reg,
+ .unreg = bpf_fuse_unreg,
+ .check_member = bpf_fuse_check_member,
+ .init_member = bpf_fuse_init_member,
+ .init = bpf_fuse_init,
+ .name = "fuse_ops",
+};
+
diff --git a/kernel/bpf/bpf_struct_ops_types.h b/kernel/bpf/bpf_struct_ops_types.h
index 5678a9ddf817..fabb2c1a9482 100644
--- a/kernel/bpf/bpf_struct_ops_types.h
+++ b/kernel/bpf/bpf_struct_ops_types.h
@@ -5,6 +5,10 @@
#ifdef CONFIG_NET
BPF_STRUCT_OPS_TYPE(bpf_dummy_ops)
#endif
+#ifdef CONFIG_FUSE_BPF
+#include <linux/bpf_fuse.h>
+BPF_STRUCT_OPS_TYPE(fuse_ops)
+#endif
#ifdef CONFIG_INET
#include <net/tcp.h>
BPF_STRUCT_OPS_TYPE(tcp_congestion_ops)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 027f9f8a3551..c34fd9e70039 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -25,6 +25,7 @@
#include <linux/bsearch.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>
+#include <linux/bpf_fuse.h>
#include <net/sock.h>
#include "../tools/lib/bpf/relo_core.h"

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fd959824469d..b3bda15283c0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9597,6 +9597,8 @@ enum special_kfunc_type {
KF_bpf_dynptr_from_xdp,
KF_bpf_dynptr_slice,
KF_bpf_dynptr_slice_rdwr,
+ KF_bpf_fuse_get_rw_dynptr,
+ KF_bpf_fuse_get_ro_dynptr,
};

BTF_SET_START(special_kfunc_set)
@@ -9616,6 +9618,8 @@ BTF_ID(func, bpf_dynptr_from_skb)
BTF_ID(func, bpf_dynptr_from_xdp)
BTF_ID(func, bpf_dynptr_slice)
BTF_ID(func, bpf_dynptr_slice_rdwr)
+BTF_ID(func, bpf_fuse_get_rw_dynptr)
+BTF_ID(func, bpf_fuse_get_ro_dynptr)
BTF_SET_END(special_kfunc_set)

BTF_ID_LIST(special_kfunc_list)
@@ -9637,6 +9641,8 @@ BTF_ID(func, bpf_dynptr_from_skb)
BTF_ID(func, bpf_dynptr_from_xdp)
BTF_ID(func, bpf_dynptr_slice)
BTF_ID(func, bpf_dynptr_slice_rdwr)
+BTF_ID(func, bpf_fuse_get_rw_dynptr)
+BTF_ID(func, bpf_fuse_get_ro_dynptr)

static bool is_kfunc_bpf_rcu_read_lock(struct bpf_kfunc_call_arg_meta *meta)
{
@@ -10349,6 +10355,9 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
dynptr_arg_type |= DYNPTR_TYPE_SKB;
else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_from_xdp])
dynptr_arg_type |= DYNPTR_TYPE_XDP;
+ else if (meta->func_id == special_kfunc_list[KF_bpf_fuse_get_rw_dynptr] ||
+ meta->func_id == special_kfunc_list[KF_bpf_fuse_get_ro_dynptr])
+ dynptr_arg_type |= DYNPTR_TYPE_LOCAL;

ret = process_dynptr_func(env, regno, insn_idx, dynptr_arg_type);
if (ret < 0)
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:45:59

by Daniel Rosenberg

Subject: [RFC PATCH v3 29/37] fuse-bpf: Export Functions

These functions need to be exported so that fuse can be built as a module.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
kernel/bpf/bpf_struct_ops.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index deb9eecaf1e4..0bf727996a08 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -745,6 +745,7 @@ bool bpf_struct_ops_get(const void *kdata)
map = __bpf_map_inc_not_zero(&st_map->map, false);
return !IS_ERR(map);
}
+EXPORT_SYMBOL_GPL(bpf_struct_ops_get);

void bpf_struct_ops_put(const void *kdata)
{
@@ -756,6 +757,7 @@ void bpf_struct_ops_put(const void *kdata)

bpf_map_put(&st_map->map);
}
+EXPORT_SYMBOL_GPL(bpf_struct_ops_put);

static bool bpf_struct_ops_valid_to_reg(struct bpf_map *map)
{
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:46:02

by Daniel Rosenberg

Subject: [RFC PATCH v3 30/37] fuse: Provide registration functions for fuse-bpf

Fuse may be built as a module, but the verifier components are always
built in. This provides a means for fuse-bpf to handle struct_op programs
once the module is loaded.
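
The indirection can be pictured with the following userspace sketch, in
which a built-in core forwards to callbacks that a module installs at load
time. All sk_* names are hypothetical and exist only for illustration; the
shape follows register_fuse_bpf() and bpf_fuse_reg() from this series, but
this is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Callback table that the loadable side hands to the built-in side. */
struct sk_attach {
	int (*reg)(void *kdata);
	void (*unreg)(void *kdata);
};

/* Built-in side: NULL until the "module" has registered itself. */
static struct sk_attach *sk_current;

static int sk_register_attach(struct sk_attach *a)
{
	assert(a && a->reg && a->unreg);
	sk_current = a;
	return 0;
}

static void sk_unregister_attach(struct sk_attach *a)
{
	/* Ignore requests from anything other than the current owner. */
	if (sk_current == a)
		sk_current = NULL;
}

/* Built-in entry points: forward if a module is present. */
static int sk_core_reg(void *kdata)
{
	if (sk_current)
		return sk_current->reg(kdata);
	return -EOPNOTSUPP;
}

static void sk_core_unreg(void *kdata)
{
	if (sk_current)
		sk_current->unreg(kdata);
}

/* "Module" side: a trivial implementation that counts registrations. */
static int sk_count;

static int sk_mod_reg(void *kdata)
{
	(void)kdata;
	sk_count++;
	return 0;
}

static void sk_mod_unreg(void *kdata)
{
	(void)kdata;
	sk_count--;
}

static struct sk_attach sk_mod_attach = { sk_mod_reg, sk_mod_unreg };
```

In this sketch, a registration attempt made before the module has called
sk_register_attach() fails with -EOPNOTSUPP, which mirrors the "FUSE not
found" path in bpf_fuse_reg().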

Signed-off-by: Daniel Rosenberg <[email protected]>
---
fs/fuse/Makefile | 2 +-
fs/fuse/backing.c | 2 +
fs/fuse/bpf_register.c | 209 +++++++++++++++++++++++++++++++++++++++
fs/fuse/fuse_i.h | 26 +++++
include/linux/bpf_fuse.h | 8 ++
5 files changed, 246 insertions(+), 1 deletion(-)
create mode 100644 fs/fuse/bpf_register.c

diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile
index a0853c439db2..903253db7285 100644
--- a/fs/fuse/Makefile
+++ b/fs/fuse/Makefile
@@ -9,6 +9,6 @@ obj-$(CONFIG_VIRTIO_FS) += virtiofs.o

fuse-y := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o ioctl.o
fuse-$(CONFIG_FUSE_DAX) += dax.o
-fuse-$(CONFIG_FUSE_BPF) += backing.o
+fuse-$(CONFIG_FUSE_BPF) += backing.o bpf_register.o

virtiofs-y := virtio_fs.o
diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index e807ae4f6f53..898ef9e05e9d 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -3360,6 +3360,7 @@ int fuse_bpf_access(int *out, struct inode *inode, int mask)

int __init fuse_bpf_init(void)
{
+ init_fuse_bpf();
fuse_bpf_aio_request_cachep = kmem_cache_create("fuse_bpf_aio_req",
sizeof(struct fuse_bpf_aio_req),
0, SLAB_HWCACHE_ALIGN, NULL);
@@ -3371,5 +3372,6 @@ int __init fuse_bpf_init(void)

void __exit fuse_bpf_cleanup(void)
{
+ uninit_fuse_bpf();
kmem_cache_destroy(fuse_bpf_aio_request_cachep);
}
diff --git a/fs/fuse/bpf_register.c b/fs/fuse/bpf_register.c
new file mode 100644
index 000000000000..dfe15dcf3477
--- /dev/null
+++ b/fs/fuse/bpf_register.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * FUSE-BPF: Filesystem in Userspace with BPF
+ * Copyright (c) 2021 Google LLC
+ */
+
+#include <linux/bpf_verifier.h>
+#include <linux/bpf_fuse.h>
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <linux/hashtable.h>
+
+#include "fuse_i.h"
+
+struct fuse_ops tmp_f_op_empty = { 0 };
+struct fuse_ops *tmp_f_op = &tmp_f_op_empty;
+
+struct hashtable_entry {
+ struct hlist_node hlist;
+ struct hlist_node dlist; /* for deletion cleanup */
+ struct qstr key;
+ struct fuse_ops *ops;
+};
+
+static DEFINE_HASHTABLE(name_to_ops, 8);
+
+static unsigned int full_name_case_hash(const void *salt, const unsigned char *name, unsigned int len)
+{
+ unsigned long hash = init_name_hash(salt);
+
+ while (len--)
+ hash = partial_name_hash(tolower(*name++), hash);
+ return end_name_hash(hash);
+}
+
+static inline void qstr_init(struct qstr *q, const char *name)
+{
+ q->name = name;
+ q->len = strlen(q->name);
+ q->hash = full_name_case_hash(0, q->name, q->len);
+}
+
+static inline int qstr_copy(const struct qstr *src, struct qstr *dest)
+{
+ dest->name = kstrdup(src->name, GFP_KERNEL);
+ dest->hash_len = src->hash_len;
+ return !!dest->name;
+}
+
+static inline int qstr_eq(const struct qstr *s1, const struct qstr *s2)
+{
+ int res, r1, r2, r3;
+
+ r1 = s1->len == s2->len;
+ r2 = s1->hash == s2->hash;
+ r3 = memcmp(s1->name, s2->name, s1->len);
+ res = (s1->len == s2->len && s1->hash == s2->hash && !memcmp(s1->name, s2->name, s1->len));
+ return res;
+}
+
+static struct fuse_ops *__find_fuse_ops(const struct qstr *key)
+{
+ struct hashtable_entry *hash_cur;
+ unsigned int hash = key->hash;
+ struct fuse_ops *ret_ops;
+
+ rcu_read_lock();
+ hash_for_each_possible_rcu(name_to_ops, hash_cur, hlist, hash) {
+ if (qstr_eq(key, &hash_cur->key)) {
+ ret_ops = hash_cur->ops;
+ ret_ops = get_fuse_ops(ret_ops);
+ rcu_read_unlock();
+ return ret_ops;
+ }
+ }
+ rcu_read_unlock();
+ return NULL;
+}
+
+struct fuse_ops *get_fuse_ops(struct fuse_ops *ops)
+{
+ if (bpf_try_module_get(ops, BPF_MODULE_OWNER))
+ return ops;
+ else
+ return NULL;
+}
+
+void put_fuse_ops(struct fuse_ops *ops)
+{
+ if (ops)
+ bpf_module_put(ops, BPF_MODULE_OWNER);
+}
+
+struct fuse_ops *find_fuse_ops(const char *key)
+{
+ struct qstr q;
+
+ qstr_init(&q, key);
+ return __find_fuse_ops(&q);
+}
+
+static struct hashtable_entry *alloc_hashtable_entry(const struct qstr *key,
+ struct fuse_ops *value)
+{
+ struct hashtable_entry *ret = kzalloc(sizeof(*ret), GFP_KERNEL);
+ if (!ret)
+ return NULL;
+ INIT_HLIST_NODE(&ret->dlist);
+ INIT_HLIST_NODE(&ret->hlist);
+
+ if (!qstr_copy(key, &ret->key)) {
+ kfree(ret);
+ return NULL;
+ }
+
+ ret->ops = value;
+ return ret;
+}
+
+static int __register_fuse_op(struct fuse_ops *value)
+{
+ struct hashtable_entry *hash_cur;
+ struct hashtable_entry *new_entry;
+ struct qstr key;
+ unsigned int hash;
+
+ qstr_init(&key, value->name);
+ hash = key.hash;
+ hash_for_each_possible_rcu(name_to_ops, hash_cur, hlist, hash) {
+ if (qstr_eq(&key, &hash_cur->key)) {
+ return -EEXIST;
+ }
+ }
+ new_entry = alloc_hashtable_entry(&key, value);
+ if (!new_entry)
+ return -ENOMEM;
+ hash_add_rcu(name_to_ops, &new_entry->hlist, hash);
+ return 0;
+}
+
+static int register_fuse_op(struct fuse_ops *value)
+{
+ int err;
+
+ if (bpf_try_module_get(value, BPF_MODULE_OWNER))
+ err = __register_fuse_op(value);
+ else
+ return -EBUSY;
+
+ return err;
+}
+
+static void unregister_fuse_op(struct fuse_ops *value)
+{
+ struct hashtable_entry *hash_cur;
+ struct qstr key;
+ unsigned int hash;
+ struct hlist_node *h_t;
+ HLIST_HEAD(free_list);
+
+ qstr_init(&key, value->name);
+ hash = key.hash;
+
+ hash_for_each_possible_rcu(name_to_ops, hash_cur, hlist, hash) {
+ if (qstr_eq(&key, &hash_cur->key)) {
+ hash_del_rcu(&hash_cur->hlist);
+ hlist_add_head(&hash_cur->dlist, &free_list);
+ }
+ }
+ synchronize_rcu();
+ bpf_module_put(value, BPF_MODULE_OWNER);
+ hlist_for_each_entry_safe(hash_cur, h_t, &free_list, dlist)
+ kfree(hash_cur);
+}
+
+static void fuse_op_list_destroy(void)
+{
+ struct hashtable_entry *hash_cur;
+ struct hlist_node *h_t;
+ HLIST_HEAD(free_list);
+ int i;
+
+ //mutex_lock(&sdcardfs_super_list_lock);
+ hash_for_each_rcu(name_to_ops, i, hash_cur, hlist) {
+ hash_del_rcu(&hash_cur->hlist);
+ hlist_add_head(&hash_cur->dlist, &free_list);
+ }
+ synchronize_rcu();
+ hlist_for_each_entry_safe(hash_cur, h_t, &free_list, dlist)
+ kfree(hash_cur);
+ //mutex_unlock(&sdcardfs_super_list_lock);
+ pr_info("fuse: destroyed fuse_op list\n");
+}
+
+static struct bpf_fuse_ops_attach bpf_fuse_ops_connect = {
+ .fuse_register_bpf = &register_fuse_op,
+ .fuse_unregister_bpf = &unregister_fuse_op,
+};
+
+int init_fuse_bpf(void)
+{
+ return register_fuse_bpf(&bpf_fuse_ops_connect);
+}
+
+void uninit_fuse_bpf(void)
+{
+ unregister_fuse_bpf(&bpf_fuse_ops_connect);
+ fuse_op_list_destroy();
+}
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 2bd45c8658e8..84c591d02e43 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1390,6 +1390,32 @@ void fuse_file_release(struct inode *inode, struct fuse_file *ff,
unsigned int open_flags, fl_owner_t id, bool isdir);

/* backing.c */
+#ifdef CONFIG_FUSE_BPF
+struct fuse_ops *find_fuse_ops(const char *key);
+struct fuse_ops *get_fuse_ops(struct fuse_ops *ops);
+void put_fuse_ops(struct fuse_ops *ops);
+int init_fuse_bpf(void);
+void uninit_fuse_bpf(void);
+#else
+static inline int init_fuse_bpf(void)
+{
+ return -EOPNOTSUPP;
+}
+static inline void uninit_fuse_bpf(void)
+{
+}
+static inline struct fuse_ops *find_fuse_ops(const char *key)
+{
+ return NULL;
+}
+static inline struct fuse_ops *get_fuse_ops(struct fuse_ops *ops)
+{
+ return NULL;
+}
+static inline void put_fuse_ops(struct fuse_ops *ops)
+{
+}
+#endif

enum fuse_bpf_set {
FUSE_BPF_UNCHANGED = 0,
diff --git a/include/linux/bpf_fuse.h b/include/linux/bpf_fuse.h
index 780a7889aea2..2183a7a45c92 100644
--- a/include/linux/bpf_fuse.h
+++ b/include/linux/bpf_fuse.h
@@ -270,4 +270,12 @@ struct fuse_ops {
char name[BPF_FUSE_NAME_MAX];
};

+struct bpf_fuse_ops_attach {
+ int (*fuse_register_bpf)(struct fuse_ops *f_ops);
+ void (*fuse_unregister_bpf)(struct fuse_ops *f_ops);
+};
+
+int register_fuse_bpf(struct bpf_fuse_ops_attach *reg_ops);
+void unregister_fuse_bpf(struct bpf_fuse_ops_attach *reg_ops);
+
#endif /* _BPF_FUSE_H */
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:46:06

by Daniel Rosenberg

Subject: [RFC PATCH v3 31/37] fuse-bpf: Set fuse_ops at mount or lookup time

This adds the ability to associate a fuse_ops struct_op program with
inodes in fuse. This can be done either at mount time for the root, or
per inode at lookup time.
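
The selection rules can be sketched in userspace as follows. This is a
simplified model of what fuse_handle_bpf_ops() in this patch appears to
do. The sk_* names, the plain integer refcount, and the enum values other
than "unchanged = 0" are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Only UNCHANGED = 0 is taken from the series; the rest is illustrative. */
enum sk_action { SK_UNCHANGED = 0, SK_REMOVE, SK_SET };

/* Hypothetical stand-in for a refcounted fuse_ops. */
struct sk_ops {
	int refs;
};

static struct sk_ops *sk_get(struct sk_ops *o)
{
	if (o)
		o->refs++;
	return o;
}

static void sk_put(struct sk_ops *o)
{
	if (o)
		o->refs--;
}

/*
 * Decide which ops an inode should carry and store them in *slot.
 * set_ops is expected to arrive with a reference already held, as it
 * would after a lookup by name.
 */
static int sk_resolve(enum sk_action action, struct sk_ops *parent_ops,
		      int has_parent, struct sk_ops *set_ops,
		      struct sk_ops **slot)
{
	struct sk_ops *new_ops;

	assert(slot != NULL);

	/* No parent to inherit from: leave whatever is there alone. */
	if (action == SK_UNCHANGED && !has_parent)
		return 0;

	switch (action) {
	case SK_UNCHANGED:
		new_ops = sk_get(parent_ops); /* inherit from the parent */
		break;
	case SK_REMOVE:
		new_ops = NULL;
		break;
	case SK_SET:
		if (!set_ops)
			return -EINVAL;
		new_ops = set_ops;
		break;
	default:
		return -EINVAL;
	}

	/* An inode that already has ops may not switch to different ones. */
	if (*slot) {
		sk_put(new_ops);
		return new_ops == *slot ? 0 : -EINVAL;
	}

	*slot = new_ops;
	return 0;
}
```

In this model, a repeated lookup that resolves to the same ops succeeds
without leaking a reference, while an attempt to swap in different ops on
a live inode is rejected with -EINVAL.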

Signed-off-by: Daniel Rosenberg <[email protected]>
---
fs/fuse/backing.c | 91 +++++++++++++++++++++++++++++++++++++++++++----
fs/fuse/dir.c | 16 +++++++--
fs/fuse/fuse_i.h | 12 +++++++
fs/fuse/inode.c | 28 ++++++++++++++-
4 files changed, 138 insertions(+), 9 deletions(-)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 898ef9e05e9d..d5ba1e334e69 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -6,6 +6,7 @@

#include "fuse_i.h"

+#include <linux/bpf.h>
#include <linux/bpf_fuse.h>
#include <linux/fdtable.h>
#include <linux/file.h>
@@ -168,12 +169,13 @@ static void fuse_get_backing_path(struct file *file, struct path *path)

static bool has_file(int type)
{
- return type == FUSE_ENTRY_BACKING;
+ return (type == FUSE_ENTRY_BACKING);
}

/*
- * The optional fuse bpf entry lists the backing file for a particular
- * lookup. These are inherited by default.
+ * The optional fuse bpf entry lists the bpf and backing files for a particular
+ * lookup. These are inherited by default. A bpf program requires a backing file
+ * to be meaningful.
*
* In the future, we may support multiple bpfs, and multiple backing files for
* the bpf to choose between.
@@ -182,14 +184,14 @@ static bool has_file(int type)
* file. Changing only the bpf is valid, though meaningless if there isn't an
* inherited backing file.
*
- * Support for the bpf program will be added in a later patch
- *
*/
int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num)
{
struct fuse_bpf_entry_out *fbeo;
+ struct fuse_ops *ops;
struct file *file;
bool has_backing = false;
+ bool has_bpf_ops = false;
int num_entries;
int err = -EINVAL;
int i;
@@ -227,6 +229,11 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num)
goto out_err;
fbe->backing_action = FUSE_BPF_REMOVE;
break;
+ case FUSE_ENTRY_REMOVE_BPF:
+ if (fbe->bpf_action || i == 2)
+ goto out_err;
+ fbe->bpf_action = FUSE_BPF_REMOVE;
+ break;
case FUSE_ENTRY_BACKING:
if (fbe->backing_action)
goto out_err;
@@ -234,8 +241,17 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num)
fbe->backing_action = FUSE_BPF_SET;
has_backing = true;
break;
+ case FUSE_ENTRY_BPF:
+ if (fbe->bpf_action || i == 2)
+ goto out_err;
+ ops = find_fuse_ops(fbeo->name);
+ if (!ops)
+ goto out_err;
+ has_bpf_ops = true;
+ fbe->bpf_action = FUSE_BPF_SET;
+ fbe->ops = ops;
+ break;
default:
- err = -EINVAL;
goto out_err;
}
if (has_file(fbeo->entry_type)) {
@@ -252,6 +268,10 @@ int parse_fuse_bpf_entry(struct fuse_bpf_entry *fbe, int num)
fput(file);
if (has_backing)
path_put_init(&fbe->backing_path);
+ if (has_bpf_ops) {
+ put_fuse_ops(fbe->ops);
+ fbe->ops = NULL;
+ }
return err;
}

@@ -527,6 +547,15 @@ static int fuse_create_open_backing(struct bpf_fuse_args *fa, int *out,
goto out;
}

+ if (get_fuse_inode(inode)->bpf_ops)
+ put_fuse_ops(get_fuse_inode(inode)->bpf_ops);
+ get_fuse_inode(inode)->bpf_ops = dir_fuse_inode->bpf_ops;
+ if (get_fuse_inode(inode)->bpf_ops)
+ if (!get_fuse_ops(get_fuse_inode(inode)->bpf_ops)) {
+ *out = -EINVAL;
+ goto out;
+ }
+
newent = d_splice_alias(inode, entry);
if (IS_ERR(newent)) {
*out = PTR_ERR(newent);
@@ -1842,6 +1871,52 @@ int fuse_handle_backing(struct fuse_bpf_entry *fbe, struct path *backing_path)
return 0;
}

+int fuse_handle_bpf_ops(struct fuse_bpf_entry *fbe, struct inode *parent,
+ struct fuse_ops **ops)
+{
+ struct fuse_ops *new_ops;
+
+ /* Parent isn't present, but we want to keep the existing program.
+ * Don't touch the bpf program at all in this case.
+ */
+ if (fbe->bpf_action == FUSE_BPF_UNCHANGED && !parent)
+ return 0;
+
+ switch (fbe->bpf_action) {
+ case FUSE_BPF_UNCHANGED: {
+ struct fuse_inode *pi = get_fuse_inode(parent);
+
+ new_ops = pi->bpf_ops;
+ if (new_ops && !get_fuse_ops(new_ops))
+ return -EINVAL;
+ break;
+ }
+
+ case FUSE_BPF_REMOVE:
+ new_ops = NULL;
+ break;
+
+ case FUSE_BPF_SET:
+ new_ops = fbe->ops;
+
+ if (!new_ops)
+ return -EINVAL;
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ /* Cannot change existing program */
+ if (*ops) {
+ put_fuse_ops(new_ops);
+ return new_ops == *ops ? 0 : -EINVAL;
+ }
+
+ *ops = new_ops;
+ return 0;
+}
+
static int fuse_lookup_finalize(struct bpf_fuse_args *fa, struct dentry **out,
struct inode *dir, struct dentry *entry, unsigned int flags)
{
@@ -1879,6 +1954,10 @@ static int fuse_lookup_finalize(struct bpf_fuse_args *fa, struct dentry **out,
if (IS_ERR(inode))
return PTR_ERR(inode);

+ error = fuse_handle_bpf_ops(fbe, dir, &get_fuse_inode(inode)->bpf_ops);
+ if (error)
+ return error;
+
get_fuse_inode(inode)->nodeid = feo->nodeid;

*out = d_splice_alias(inode, entry);
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d1c3b2bfb0b1..b7bc8260a537 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -10,6 +10,7 @@

#include <linux/pagemap.h>
#include <linux/file.h>
+#include <linux/filter.h>
#include <linux/fs_context.h>
#include <linux/moduleparam.h>
#include <linux/sched.h>
@@ -185,6 +186,7 @@ static bool backing_data_changed(struct fuse_inode *fi, struct dentry *entry,
{
struct path new_backing_path;
struct inode *new_backing_inode;
+ struct fuse_ops *ops = NULL;
int err;
bool ret = true;

@@ -199,9 +201,15 @@ static bool backing_data_changed(struct fuse_inode *fi, struct dentry *entry,
if (err)
goto put_inode;

- ret = (fi->backing_inode != new_backing_inode ||
- !path_equal(&get_fuse_dentry(entry)->backing_path, &new_backing_path));
+ err = fuse_handle_bpf_ops(bpf_arg, entry->d_parent->d_inode, &ops);
+ if (err)
+ goto put_bpf;

+ ret = (ops != fi->bpf_ops || fi->backing_inode != new_backing_inode ||
+ !path_equal(&get_fuse_dentry(entry)->backing_path, &new_backing_path));
+put_bpf:
+ if (ops)
+ put_fuse_ops(ops);
put_inode:
path_put(&new_backing_path);
return ret;
@@ -466,6 +474,10 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid,
*inode = fuse_iget_backing(sb, outarg->nodeid, backing_inode);
if (!*inode)
goto out_queue_forget;
+
+ err = fuse_handle_bpf_ops(&bpf_arg, NULL, &get_fuse_inode(*inode)->bpf_ops);
+ if (err)
+ goto out;
} else
#endif
{
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 84c591d02e43..15962ab3b381 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -33,6 +33,7 @@
#include <linux/pid_namespace.h>
#include <linux/refcount.h>
#include <linux/user_namespace.h>
+#include <linux/bpf_fuse.h>
#include <linux/magic.h>

/** Default max number of pages that can be used in a single read request */
@@ -109,6 +110,12 @@ struct fuse_inode {
* If this is set, nodeid is 0.
*/
struct inode *backing_inode;
+
+ /**
+ * fuse_ops, provides handlers to run on all operations to determine
+ * whether to pass through or handle in place
+ */
+ struct fuse_ops *bpf_ops;
#endif

/** Unique ID, which identifies the inode between userspace
@@ -571,6 +578,7 @@ struct fuse_fs_context {
unsigned int max_read;
unsigned int blksize;
const char *subtype;
+ struct fuse_ops *root_ops;
struct file *root_dir;

/* DAX device, may be NULL */
@@ -1428,6 +1436,8 @@ struct fuse_bpf_entry {

enum fuse_bpf_set backing_action;
struct path backing_path;
+ enum fuse_bpf_set bpf_action;
+ struct fuse_ops *ops;
bool is_used;
};

@@ -1651,6 +1661,8 @@ static inline int fuse_bpf_access(int *out, struct inode *inode, int mask)
ssize_t fuse_backing_mmap(struct file *file, struct vm_area_struct *vma);

int fuse_handle_backing(struct fuse_bpf_entry *fbe, struct path *backing_path);
+int fuse_handle_bpf_ops(struct fuse_bpf_entry *fbe, struct inode *parent,
+ struct fuse_ops **ops);

int fuse_revalidate_backing(struct dentry *entry, unsigned int flags);

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 31f34962bc9b..7fd79efbdac1 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -80,6 +80,7 @@ static struct inode *fuse_alloc_inode(struct super_block *sb)
fi->inval_mask = 0;
#ifdef CONFIG_FUSE_BPF
fi->backing_inode = NULL;
+ fi->bpf_ops = NULL;
#endif
fi->nodeid = 0;
fi->nlookup = 0;
@@ -125,6 +126,9 @@ static void fuse_evict_inode(struct inode *inode)

#ifdef CONFIG_FUSE_BPF
iput(fi->backing_inode);
+ if (fi->bpf_ops)
+ put_fuse_ops(fi->bpf_ops);
+ fi->bpf_ops = NULL;
#endif

truncate_inode_pages_final(&inode->i_data);
@@ -755,6 +759,7 @@ enum {
OPT_ALLOW_OTHER,
OPT_MAX_READ,
OPT_BLKSIZE,
+ OPT_ROOT_BPF,
OPT_ROOT_DIR,
OPT_NO_DAEMON,
OPT_ERR
@@ -771,6 +776,7 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
fsparam_u32 ("max_read", OPT_MAX_READ),
fsparam_u32 ("blksize", OPT_BLKSIZE),
fsparam_string ("subtype", OPT_SUBTYPE),
+ fsparam_string ("root_bpf", OPT_ROOT_BPF),
fsparam_u32 ("root_dir", OPT_ROOT_DIR),
fsparam_flag ("no_daemon", OPT_NO_DAEMON),
{}
@@ -856,6 +862,18 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
ctx->blksize = result.uint_32;
break;

+ case OPT_ROOT_BPF:
+ if (strnlen(param->string, BPF_FUSE_NAME_MAX + 1) > BPF_FUSE_NAME_MAX) {
+ return invalfc(fsc, "root_bpf name too long. Max length is %d", BPF_FUSE_NAME_MAX);
+ }
+
+ ctx->root_ops = find_fuse_ops(param->string);
+ if (IS_ERR_OR_NULL(ctx->root_ops)) {
+ ctx->root_ops = NULL;
+ return invalfc(fsc, "Unable to find bpf program");
+ }
+ break;
+
case OPT_ROOT_DIR:
ctx->root_dir = fget(result.uint_32);
if (!ctx->root_dir)
@@ -881,6 +899,8 @@ static void fuse_free_fsc(struct fs_context *fsc)
if (ctx) {
if (ctx->root_dir)
fput(ctx->root_dir);
+ if (ctx->root_ops)
+ put_fuse_ops(ctx->root_ops);
kfree(ctx->subtype);
kfree(ctx);
}
@@ -1010,6 +1030,7 @@ EXPORT_SYMBOL_GPL(fuse_conn_get);

static struct inode *fuse_get_root_inode(struct super_block *sb,
unsigned int mode,
+ struct fuse_ops *root_bpf_ops,
struct file *backing_fd)
{
struct fuse_attr attr;
@@ -1024,6 +1045,10 @@ static struct inode *fuse_get_root_inode(struct super_block *sb,
return NULL;

#ifdef CONFIG_FUSE_BPF
+ get_fuse_inode(inode)->bpf_ops = root_bpf_ops;
+ if (root_bpf_ops)
+ get_fuse_ops(root_bpf_ops);
+
if (backing_fd) {
get_fuse_inode(inode)->backing_inode = backing_fd->f_inode;
ihold(backing_fd->f_inode);
@@ -1704,7 +1729,8 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
fc->no_daemon = ctx->no_daemon;

err = -ENOMEM;
- root = fuse_get_root_inode(sb, ctx->rootmode, ctx->root_dir);
+ root = fuse_get_root_inode(sb, ctx->rootmode, ctx->root_ops,
+ ctx->root_dir);
sb->s_d_op = &fuse_root_dentry_operations;
root_dentry = d_make_root(root);
if (!root_dentry)
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:46:26

by Daniel Rosenberg

Subject: [RFC PATCH v3 32/37] fuse-bpf: Call bpf for pre/post filters

This allows BPF programs to alter the input or output parameters of fuse
calls that are handled directly by the backing filesystem. A prefilter can
signal that the entire operation should instead go through regular fuse,
or that a postfilter call is needed.

Signed-off-by: Daniel Rosenberg <[email protected]>
---
fs/fuse/backing.c | 606 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 606 insertions(+)
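
The decision sequence that this patch adds to bpf_fuse_backing() can be hard to follow inside the macro. The following is a minimal userspace sketch of that sequence, not the kernel code. Assumptions: every sim_* name is hypothetical, the numeric values of the return codes are made up (the real BPF_FUSE_CONTINUE, BPF_FUSE_USER and BPF_FUSE_POSTFILTER values are not shown in this patch), and the buffer handling, opcode flags and finalize step are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the BPF_FUSE_* codes; the values are assumed. */
enum sim_next {
	SIM_CONTINUE = 0,	/* run the backing op, skip the postfilter */
	SIM_USER = 1,		/* hand the whole op to the userspace daemon */
	SIM_POSTFILTER = 2,	/* run the backing op, then the postfilter */
};

enum sim_path { SIM_PATH_BACKING, SIM_PATH_USERSPACE };

struct sim_args {
	int value;	/* stands in for the op's in/out parameters */
	int error_in;	/* backing error made visible to the postfilter */
};

struct sim_ops {
	int (*prefilter)(struct sim_args *args);
	int (*postfilter)(struct sim_args *args);
};

struct sim_result {
	enum sim_path path;
	int error;
};

static struct sim_result sim_run(const struct sim_ops *ops, struct sim_args *args,
				 int (*backing)(struct sim_args *args))
{
	struct sim_result res = { SIM_PATH_BACKING, 0 };
	int next = SIM_CONTINUE;

	assert(args != NULL && backing != NULL);

	/* A missing ops table or missing handler means plain passthrough. */
	if (ops && ops->prefilter)
		next = ops->prefilter(args);
	if (next < 0) {
		/* A negative prefilter result aborts the op with that error. */
		res.error = next;
		return res;
	}
	if (next == SIM_USER) {
		/* The prefilter asked for the regular userspace path. */
		res.path = SIM_PATH_USERSPACE;
		return res;
	}

	res.error = backing(args);
	if (res.error < 0)
		args->error_in = res.error;

	/* The postfilter only runs when the prefilter requested it. */
	if (next == SIM_POSTFILTER && ops && ops->postfilter) {
		next = ops->postfilter(args);
		if (next < 0)
			res.error = next;
	}
	return res;
}

/* Example handlers, purely illustrative. */
static int sim_backing_add10(struct sim_args *args) { args->value += 10; return 0; }
static int sim_pre_double(struct sim_args *args) { args->value *= 2; return SIM_POSTFILTER; }
static int sim_post_inc(struct sim_args *args) { args->value += 1; return SIM_CONTINUE; }
static int sim_pre_punt(struct sim_args *args) { (void)args; return SIM_USER; }
```

With sim_pre_double and sim_post_inc installed, a starting value of 1 becomes 2 before the backing call, 12 after it, and 13 after the postfilter. With sim_pre_punt, the value is left untouched and the result reports the userspace path.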

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index d5ba1e334e69..9217e9f83d98 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -14,6 +14,27 @@
#include <linux/namei.h>
#include <linux/uio.h>

+static inline void bpf_fuse_set_in_immutable(struct bpf_fuse_args *fa)
+{
+ int i;
+
+ for (i = 0; i < FUSE_MAX_ARGS_IN; i++)
+ if (fa->in_args[i].is_buffer)
+ fa->in_args[i].buffer->flags |= BPF_FUSE_IMMUTABLE;
+}
+
+static inline void bpf_fuse_free_alloced(struct bpf_fuse_args *fa)
+{
+ int i;
+
+ for (i = 0; i < FUSE_MAX_ARGS_IN; i++)
+ if (fa->in_args[i].is_buffer && (fa->in_args[i].buffer->flags & BPF_FUSE_ALLOCATED))
+ kfree(fa->in_args[i].buffer->data);
+ for (i = 0; i < FUSE_MAX_ARGS_OUT; i++)
+ if (fa->out_args[i].is_buffer && (fa->out_args[i].buffer->flags & BPF_FUSE_ALLOCATED))
+ kfree(fa->out_args[i].buffer->data);
+}
+
/*
* expression statement to wrap the backing filter logic
* struct inode *inode: inode with bpf and backing inode
@@ -23,6 +44,10 @@
* up fa and io based on args
* void initialize_out(struct bpf_fuse_args *fa, io *in_out, args...): function that sets
* up fa and io based on args
+ * int call_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta, io *in_out): Calls
+ * the struct_op prefilter function for the given fuse op
+ * int call_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta, io *in_out): Calls
+ * the struct_op postfilter function for the given fuse op
* int backing(struct fuse_bpf_args_internal *fa, args...): function that actually performs
* the backing io operation
* void *finalize(struct fuse_bpf_args *, args...): function that performs any final
@@ -30,13 +55,16 @@
*/
#define bpf_fuse_backing(inode, io, out, \
initialize_in, initialize_out, \
+ call_prefilter, call_postfilter, \
backing, finalize, args...) \
({ \
struct fuse_inode *fuse_inode = get_fuse_inode(inode); \
+ struct fuse_ops *fuse_ops = fuse_inode->bpf_ops; \
struct bpf_fuse_args fa = { 0 }; \
bool initialized = false; \
bool handled = false; \
ssize_t res; \
+ int bpf_next; \
io feo = { 0 }; \
int error = 0; \
\
@@ -49,16 +77,46 @@
if (error) \
break; \
\
+ fa.info.opcode |= FUSE_PREFILTER; \
+ if (fuse_ops) \
+ bpf_next = call_prefilter(fuse_ops, \
+ &fa.info, &feo); \
+ else \
+ bpf_next = BPF_FUSE_CONTINUE; \
+ if (bpf_next < 0) { \
+ error = bpf_next; \
+ break; \
+ } \
+ \
+ bpf_fuse_set_in_immutable(&fa); \
+ \
error = initialize_out(&fa, &feo, args); \
if (error) \
break; \
\
initialized = true; \
+ if (bpf_next == BPF_FUSE_USER) { \
+ handled = false; \
+ break; \
+ } \
+ \
+ fa.info.opcode &= ~FUSE_PREFILTER; \
\
error = backing(&fa, out, args); \
if (error < 0) \
fa.info.error_in = error; \
\
+ if (bpf_next == BPF_FUSE_CONTINUE) \
+ break; \
+ \
+ fa.info.opcode |= FUSE_POSTFILTER; \
+ if (bpf_next == BPF_FUSE_POSTFILTER) \
+ bpf_next = call_postfilter(fuse_ops, &fa.info, &feo);\
+ if (bpf_next < 0) { \
+ error = bpf_next; \
+ break; \
+ } \
+ \
} while (false); \
\
if (initialized && handled) { \
@@ -66,6 +124,7 @@
if (res) \
error = res; \
} \
+ bpf_fuse_free_alloced(&fa); \
\
*out = error ? _Generic((*out), \
default : \
@@ -351,6 +410,34 @@ static int fuse_open_initialize_out(struct bpf_fuse_args *fa, struct fuse_open_a
return 0;
}

+static int fuse_open_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_open_args *open)
+{
+ if (meta->opcode == (FUSE_OPEN | FUSE_PREFILTER)) {
+ if (ops->open_prefilter)
+ return ops->open_prefilter(meta, &open->in);
+ }
+ if (meta->opcode == (FUSE_OPENDIR | FUSE_PREFILTER)) {
+ if (ops->opendir_prefilter)
+ return ops->opendir_prefilter(meta, &open->in);
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_open_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_open_args *open)
+{
+ if (meta->opcode == (FUSE_OPEN | FUSE_POSTFILTER)) {
+ if (ops->open_postfilter)
+ return ops->open_postfilter(meta, &open->in, &open->out);
+ }
+ if (meta->opcode == (FUSE_OPENDIR | FUSE_POSTFILTER)) {
+ if (ops->opendir_postfilter)
+ return ops->opendir_postfilter(meta, &open->in, &open->out);
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_open_backing(struct bpf_fuse_args *fa, int *out,
struct inode *inode, struct file *file, bool isdir)
{
@@ -419,6 +506,7 @@ int fuse_bpf_open(int *out, struct inode *inode, struct file *file, bool isdir)
{
return bpf_fuse_backing(inode, struct fuse_open_args, out,
fuse_open_initialize_in, fuse_open_initialize_out,
+ fuse_open_prefilter, fuse_open_postfilter,
fuse_open_backing, fuse_open_finalize,
inode, file, isdir);
}
@@ -484,6 +572,22 @@ static int fuse_create_open_initialize_out(struct bpf_fuse_args *fa, struct fuse
return 0;
}

+static int fuse_create_open_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_create_open_args *args)
+{
+ if (ops->create_open_prefilter)
+ return ops->create_open_prefilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_create_open_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_create_open_args *args)
+{
+ if (ops->create_open_postfilter)
+ return ops->create_open_postfilter(meta, &args->in, &args->name, &args->entry_out, &args->open_out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_open_file_backing(struct inode *inode, struct file *file)
{
struct fuse_mount *fm = get_fuse_mount(inode);
@@ -586,12 +690,16 @@ static int fuse_create_open_finalize(struct bpf_fuse_args *fa, int *out,
return 0;
}

+
+
int fuse_bpf_create_open(int *out, struct inode *dir, struct dentry *entry,
struct file *file, unsigned int flags, umode_t mode)
{
return bpf_fuse_backing(dir, struct fuse_create_open_args, out,
fuse_create_open_initialize_in,
fuse_create_open_initialize_out,
+ fuse_create_open_prefilter,
+ fuse_create_open_postfilter,
fuse_create_open_backing,
fuse_create_open_finalize,
dir, entry, file, flags, mode);
@@ -652,6 +760,38 @@ static int fuse_releasedir_initialize_in(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_release_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *args)
+{
+ if (ops->release_prefilter)
+ return ops->release_prefilter(meta, args);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_release_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *args)
+{
+ if (ops->release_postfilter)
+ return ops->release_postfilter(meta, args);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_releasedir_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *args)
+{
+ if (ops->releasedir_prefilter)
+ return ops->releasedir_prefilter(meta, args);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_releasedir_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *args)
+{
+ if (ops->releasedir_postfilter)
+ return ops->releasedir_postfilter(meta, args);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_releasedir_initialize_out(struct bpf_fuse_args *fa,
struct fuse_release_in *fri,
struct inode *inode, struct file *file)
@@ -677,6 +817,7 @@ int fuse_bpf_release(int *out, struct inode *inode, struct file *file)
{
return bpf_fuse_backing(inode, struct fuse_release_in, out,
fuse_release_initialize_in, fuse_release_initialize_out,
+ fuse_release_prefilter, fuse_release_postfilter,
fuse_release_backing, fuse_release_finalize,
inode, file);
}
@@ -685,6 +826,7 @@ int fuse_bpf_releasedir(int *out, struct inode *inode, struct file *file)
{
return bpf_fuse_backing(inode, struct fuse_release_in, out,
fuse_releasedir_initialize_in, fuse_releasedir_initialize_out,
+ fuse_releasedir_prefilter, fuse_releasedir_postfilter,
fuse_release_backing, fuse_release_finalize, inode, file);
}

@@ -717,6 +859,22 @@ static int fuse_flush_initialize_out(struct bpf_fuse_args *fa, struct fuse_flush
return 0;
}

+static int fuse_flush_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_flush_in *args)
+{
+ if (ops->flush_prefilter)
+ return ops->flush_prefilter(meta, args);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_flush_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_flush_in *args)
+{
+ if (ops->flush_postfilter)
+ return ops->flush_postfilter(meta, args);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_flush_backing(struct bpf_fuse_args *fa, int *out, struct file *file, fl_owner_t id)
{
struct fuse_file *fuse_file = file->private_data;
@@ -737,6 +895,7 @@ int fuse_bpf_flush(int *out, struct inode *inode, struct file *file, fl_owner_t
{
return bpf_fuse_backing(inode, struct fuse_flush_in, out,
fuse_flush_initialize_in, fuse_flush_initialize_out,
+ fuse_flush_prefilter, fuse_flush_postfilter,
fuse_flush_backing, fuse_flush_finalize,
file, id);
}
@@ -780,6 +939,22 @@ static int fuse_lseek_initialize_out(struct bpf_fuse_args *fa, struct fuse_lseek
return 0;
}

+static int fuse_lseek_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_lseek_args *args)
+{
+ if (ops->lseek_prefilter)
+ return ops->lseek_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_lseek_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_lseek_args *args)
+{
+ if (ops->lseek_postfilter)
+ return ops->lseek_postfilter(meta, &args->in, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_lseek_backing(struct bpf_fuse_args *fa, loff_t *out,
struct file *file, loff_t offset, int whence)
{
@@ -826,6 +1001,7 @@ int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t o
{
return bpf_fuse_backing(inode, struct fuse_lseek_args, out,
fuse_lseek_initialize_in, fuse_lseek_initialize_out,
+ fuse_lseek_prefilter, fuse_lseek_postfilter,
fuse_lseek_backing, fuse_lseek_finalize,
file, offset, whence);
}
@@ -878,6 +1054,22 @@ static int fuse_copy_file_range_initialize_out(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_copy_file_range_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_copy_file_range_args *args)
+{
+ if (ops->copy_file_range_prefilter)
+ return ops->copy_file_range_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_copy_file_range_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_copy_file_range_args *args)
+{
+ if (ops->copy_file_range_postfilter)
+ return ops->copy_file_range_postfilter(meta, &args->in, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_copy_file_range_backing(struct bpf_fuse_args *fa, ssize_t *out, struct file *file_in,
loff_t pos_in, struct file *file_out, loff_t pos_out, size_t len,
unsigned int flags)
@@ -912,6 +1104,8 @@ int fuse_bpf_copy_file_range(ssize_t *out, struct inode *inode, struct file *fil
return bpf_fuse_backing(inode, struct fuse_copy_file_range_args, out,
fuse_copy_file_range_initialize_in,
fuse_copy_file_range_initialize_out,
+ fuse_copy_file_range_prefilter,
+ fuse_copy_file_range_postfilter,
fuse_copy_file_range_backing,
fuse_copy_file_range_finalize,
file_in, pos_in, file_out, pos_out, len, flags);
@@ -947,6 +1141,22 @@ static int fuse_fsync_initialize_out(struct bpf_fuse_args *fa, struct fuse_fsync
return 0;
}

+static int fuse_fsync_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in)
+{
+ if (ops->fsync_prefilter)
+ return ops->fsync_prefilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_fsync_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in)
+{
+ if (ops->fsync_postfilter)
+ return ops->fsync_postfilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_fsync_backing(struct bpf_fuse_args *fa, int *out,
struct file *file, loff_t start, loff_t end, int datasync)
{
@@ -969,6 +1179,7 @@ int fuse_bpf_fsync(int *out, struct inode *inode, struct file *file, loff_t star
{
return bpf_fuse_backing(inode, struct fuse_fsync_in, out,
fuse_fsync_initialize_in, fuse_fsync_initialize_out,
+ fuse_fsync_prefilter, fuse_fsync_postfilter,
fuse_fsync_backing, fuse_fsync_finalize,
file, start, end, datasync);
}
@@ -1003,10 +1214,27 @@ static int fuse_dir_fsync_initialize_out(struct bpf_fuse_args *fa, struct fuse_f
return 0;
}

+static int fuse_dir_fsync_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in)
+{
+ if (ops->dir_fsync_prefilter)
+ return ops->dir_fsync_prefilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_dir_fsync_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in)
+{
+ if (ops->dir_fsync_postfilter)
+ return ops->dir_fsync_postfilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
int fuse_bpf_dir_fsync(int *out, struct inode *inode, struct file *file, loff_t start, loff_t end, int datasync)
{
return bpf_fuse_backing(inode, struct fuse_fsync_in, out,
fuse_dir_fsync_initialize_in, fuse_dir_fsync_initialize_out,
+ fuse_dir_fsync_prefilter, fuse_dir_fsync_postfilter,
fuse_fsync_backing, fuse_fsync_finalize,
file, start, end, datasync);
}
@@ -1076,6 +1304,22 @@ static int fuse_getxattr_initialize_out(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_getxattr_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_args *args)
+{
+ if (ops->getxattr_prefilter)
+ return ops->getxattr_prefilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_getxattr_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_args *args)
+{
+ if (ops->getxattr_postfilter)
+ return ops->getxattr_postfilter(meta, &args->in, &args->name, &args->value, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_getxattr_backing(struct bpf_fuse_args *fa, int *out,
struct dentry *dentry, const char *name, void *value,
size_t size)
@@ -1121,6 +1365,7 @@ int fuse_bpf_getxattr(int *out, struct inode *inode, struct dentry *dentry, cons
{
return bpf_fuse_backing(inode, struct fuse_getxattr_args, out,
fuse_getxattr_initialize_in, fuse_getxattr_initialize_out,
+ fuse_getxattr_prefilter, fuse_getxattr_postfilter,
fuse_getxattr_backing, fuse_getxattr_finalize,
dentry, name, value, size);
}
@@ -1173,6 +1418,22 @@ static int fuse_listxattr_initialize_out(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_listxattr_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_args *args)
+{
+ if (ops->listxattr_prefilter)
+ return ops->listxattr_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_listxattr_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_args *args)
+{
+ if (ops->listxattr_postfilter)
+ return ops->listxattr_postfilter(meta, &args->in, &args->value, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_listxattr_backing(struct bpf_fuse_args *fa, ssize_t *out, struct dentry *dentry,
char *list, size_t size)
{
@@ -1212,6 +1473,7 @@ int fuse_bpf_listxattr(ssize_t *out, struct inode *inode, struct dentry *dentry,
{
return bpf_fuse_backing(inode, struct fuse_getxattr_args, out,
fuse_listxattr_initialize_in, fuse_listxattr_initialize_out,
+ fuse_listxattr_prefilter, fuse_listxattr_postfilter,
fuse_listxattr_backing, fuse_listxattr_finalize,
dentry, list, size);
}
@@ -1277,6 +1539,22 @@ static int fuse_setxattr_initialize_out(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_setxattr_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_setxattr_args *args)
+{
+ if (ops->setxattr_prefilter)
+ return ops->setxattr_prefilter(meta, &args->in, &args->name, &args->value);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_setxattr_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_setxattr_args *args)
+{
+ if (ops->setxattr_postfilter)
+ return ops->setxattr_postfilter(meta, &args->in, &args->name, &args->value);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_setxattr_backing(struct bpf_fuse_args *fa, int *out, struct dentry *dentry,
const char *name, const void *value, size_t size,
int flags)
@@ -1300,6 +1578,7 @@ int fuse_bpf_setxattr(int *out, struct inode *inode, struct dentry *dentry,
{
return bpf_fuse_backing(inode, struct fuse_setxattr_args, out,
fuse_setxattr_initialize_in, fuse_setxattr_initialize_out,
+ fuse_setxattr_prefilter, fuse_setxattr_postfilter,
fuse_setxattr_backing, fuse_setxattr_finalize,
dentry, name, value, size, flags);
}
@@ -1336,6 +1615,22 @@ static int fuse_removexattr_initialize_out(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_removexattr_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *in)
+{
+ if (ops->removexattr_prefilter)
+ return ops->removexattr_prefilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_removexattr_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *in)
+{
+ if (ops->removexattr_postfilter)
+ return ops->removexattr_postfilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_removexattr_backing(struct bpf_fuse_args *fa, int *out,
struct dentry *dentry, const char *name)
{
@@ -1356,6 +1651,7 @@ int fuse_bpf_removexattr(int *out, struct inode *inode, struct dentry *dentry, c
{
return bpf_fuse_backing(inode, struct fuse_buffer, out,
fuse_removexattr_initialize_in, fuse_removexattr_initialize_out,
+ fuse_removexattr_prefilter, fuse_removexattr_postfilter,
fuse_removexattr_backing, fuse_removexattr_finalize,
dentry, name);
}
@@ -1446,6 +1742,22 @@ static int fuse_file_read_iter_initialize_out(struct bpf_fuse_args *fa, struct f
return 0;
}

+static int fuse_file_read_iter_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_file_read_iter_args *args)
+{
+ if (ops->read_iter_prefilter)
+ return ops->read_iter_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_file_read_iter_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_file_read_iter_args *args)
+{
+ if (ops->read_iter_postfilter)
+ return ops->read_iter_postfilter(meta, &args->in, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_file_read_iter_backing(struct bpf_fuse_args *fa, ssize_t *out,
struct kiocb *iocb, struct iov_iter *to)
{
@@ -1513,6 +1825,8 @@ int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *ioc
return bpf_fuse_backing(inode, struct fuse_file_read_iter_args, out,
fuse_file_read_iter_initialize_in,
fuse_file_read_iter_initialize_out,
+ fuse_file_read_iter_prefilter,
+ fuse_file_read_iter_postfilter,
fuse_file_read_iter_backing,
fuse_file_read_iter_finalize,
iocb, to);
@@ -1562,6 +1876,22 @@ static int fuse_file_write_iter_initialize_out(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_write_iter_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_file_write_iter_args *args)
+{
+ if (ops->write_iter_prefilter)
+ return ops->write_iter_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_write_iter_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_file_write_iter_args *args)
+{
+ if (ops->write_iter_postfilter)
+ return ops->write_iter_postfilter(meta, &args->in, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_file_write_iter_backing(struct bpf_fuse_args *fa, ssize_t *out,
struct kiocb *iocb, struct iov_iter *from)
{
@@ -1626,6 +1956,8 @@ int fuse_bpf_file_write_iter(ssize_t *out, struct inode *inode, struct kiocb *io
return bpf_fuse_backing(inode, struct fuse_file_write_iter_args, out,
fuse_file_write_iter_initialize_in,
fuse_file_write_iter_initialize_out,
+ fuse_write_iter_prefilter,
+ fuse_write_iter_postfilter,
fuse_file_write_iter_backing,
fuse_file_write_iter_finalize,
iocb, from);
@@ -1701,6 +2033,22 @@ static int fuse_file_fallocate_initialize_out(struct bpf_fuse_args *fa,
return 0;
}

+static int fuse_file_fallocate_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_fallocate_in *in)
+{
+ if (ops->file_fallocate_prefilter)
+ return ops->file_fallocate_prefilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_file_fallocate_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_fallocate_in *in)
+{
+ if (ops->file_fallocate_postfilter)
+ return ops->file_fallocate_postfilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_file_fallocate_backing(struct bpf_fuse_args *fa, int *out,
struct file *file, int mode, loff_t offset, loff_t length)
{
@@ -1723,6 +2071,8 @@ int fuse_bpf_file_fallocate(int *out, struct inode *inode, struct file *file, in
return bpf_fuse_backing(inode, struct fuse_fallocate_in, out,
fuse_file_fallocate_initialize_in,
fuse_file_fallocate_initialize_out,
+ fuse_file_fallocate_prefilter,
+ fuse_file_fallocate_postfilter,
fuse_file_fallocate_backing,
fuse_file_fallocate_finalize,
file, mode, offset, length);
@@ -1790,6 +2140,22 @@ static int fuse_lookup_initialize_out(struct bpf_fuse_args *fa, struct fuse_look
return 0;
}

+static int fuse_lookup_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_lookup_args *args)
+{
+ if (ops->lookup_prefilter)
+ return ops->lookup_prefilter(meta, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_lookup_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_lookup_args *args)
+{
+ if (ops->lookup_postfilter)
+ return ops->lookup_postfilter(meta, &args->name, &args->out, &args->bpf_entries);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_lookup_backing(struct bpf_fuse_args *fa, struct dentry **out, struct inode *dir,
struct dentry *entry, unsigned int flags)
{
@@ -1968,6 +2334,7 @@ int fuse_bpf_lookup(struct dentry **out, struct inode *dir, struct dentry *entry
{
return bpf_fuse_backing(dir, struct fuse_lookup_args, out,
fuse_lookup_initialize_in, fuse_lookup_initialize_out,
+ fuse_lookup_prefilter, fuse_lookup_postfilter,
fuse_lookup_backing, fuse_lookup_finalize,
dir, entry, flags);
}
@@ -2034,6 +2401,22 @@ static int fuse_mknod_initialize_out(struct bpf_fuse_args *fa, struct fuse_mknod
return 0;
}

+static int fuse_mknod_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_mknod_args *args)
+{
+ if (ops->mknod_prefilter)
+ return ops->mknod_prefilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_mknod_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_mknod_args *args)
+{
+ if (ops->mknod_postfilter)
+ return ops->mknod_postfilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_mknod_backing(struct bpf_fuse_args *fa, int *out,
struct inode *dir, struct dentry *entry, umode_t mode, dev_t rdev)
{
@@ -2086,6 +2469,7 @@ int fuse_bpf_mknod(int *out, struct inode *dir, struct dentry *entry, umode_t mo
{
return bpf_fuse_backing(dir, struct fuse_mknod_args, out,
fuse_mknod_initialize_in, fuse_mknod_initialize_out,
+ fuse_mknod_prefilter, fuse_mknod_postfilter,
fuse_mknod_backing, fuse_mknod_finalize,
dir, entry, mode, rdev);
}
@@ -2187,10 +2571,27 @@ static int fuse_mkdir_finalize(struct bpf_fuse_args *fa, int *out,
return 0;
}

+static int fuse_mkdir_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_args *args)
+{
+ if (ops->mkdir_prefilter)
+ return ops->mkdir_prefilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_mkdir_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_args *args)
+{
+ if (ops->mkdir_postfilter)
+ return ops->mkdir_postfilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
int fuse_bpf_mkdir(int *out, struct inode *dir, struct dentry *entry, umode_t mode)
{
return bpf_fuse_backing(dir, struct fuse_mkdir_args, out,
fuse_mkdir_initialize_in, fuse_mkdir_initialize_out,
+ fuse_mkdir_prefilter, fuse_mkdir_postfilter,
fuse_mkdir_backing, fuse_mkdir_finalize,
dir, entry, mode);
}
@@ -2224,6 +2625,22 @@ static int fuse_rmdir_initialize_out(struct bpf_fuse_args *fa, struct fuse_buffe
return 0;
}

+static int fuse_rmdir_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ if (ops->rmdir_prefilter)
+ return ops->rmdir_prefilter(meta, name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_rmdir_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ if (ops->rmdir_postfilter)
+ return ops->rmdir_postfilter(meta, name);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_rmdir_backing(struct bpf_fuse_args *fa, int *out,
struct inode *dir, struct dentry *entry)
{
@@ -2258,6 +2675,7 @@ int fuse_bpf_rmdir(int *out, struct inode *dir, struct dentry *entry)
{
return bpf_fuse_backing(dir, struct fuse_buffer, out,
fuse_rmdir_initialize_in, fuse_rmdir_initialize_out,
+ fuse_rmdir_prefilter, fuse_rmdir_postfilter,
fuse_rmdir_backing, fuse_rmdir_finalize,
dir, entry);
}
@@ -2400,6 +2818,22 @@ static int fuse_rename2_initialize_out(struct bpf_fuse_args *fa, struct fuse_ren
return 0;
}

+static int fuse_rename2_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_rename2_args *args)
+{
+ if (ops->rename2_prefilter)
+ return ops->rename2_prefilter(meta, &args->in, &args->old_name, &args->new_name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_rename2_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_rename2_args *args)
+{
+ if (ops->rename2_postfilter)
+ return ops->rename2_postfilter(meta, &args->in, &args->old_name, &args->new_name);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_rename2_backing(struct bpf_fuse_args *fa, int *out,
struct inode *olddir, struct dentry *oldent,
struct inode *newdir, struct dentry *newent,
@@ -2427,6 +2861,7 @@ int fuse_bpf_rename2(int *out, struct inode *olddir, struct dentry *oldent,
{
return bpf_fuse_backing(olddir, struct fuse_rename2_args, out,
fuse_rename2_initialize_in, fuse_rename2_initialize_out,
+ fuse_rename2_prefilter, fuse_rename2_postfilter,
fuse_rename2_backing, fuse_rename2_finalize,
olddir, oldent, newdir, newent, flags);
}
@@ -2487,6 +2922,22 @@ static int fuse_rename_initialize_out(struct bpf_fuse_args *fa, struct fuse_rena
return 0;
}

+static int fuse_rename_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_rename_args *args)
+{
+ if (ops->rename_prefilter)
+ return ops->rename_prefilter(meta, &args->in, &args->old_name, &args->new_name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_rename_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_rename_args *args)
+{
+ if (ops->rename_postfilter)
+ return ops->rename_postfilter(meta, &args->in, &args->old_name, &args->new_name);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_rename_backing(struct bpf_fuse_args *fa, int *out,
struct inode *olddir, struct dentry *oldent,
struct inode *newdir, struct dentry *newent)
@@ -2508,6 +2959,7 @@ int fuse_bpf_rename(int *out, struct inode *olddir, struct dentry *oldent,
{
return bpf_fuse_backing(olddir, struct fuse_rename_args, out,
fuse_rename_initialize_in, fuse_rename_initialize_out,
+ fuse_rename_prefilter, fuse_rename_postfilter,
fuse_rename_backing, fuse_rename_finalize,
olddir, oldent, newdir, newent);
}
@@ -2541,6 +2993,22 @@ static int fuse_unlink_initialize_out(struct bpf_fuse_args *fa, struct fuse_buff
return 0;
}

+static int fuse_unlink_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ if (ops->unlink_prefilter)
+ return ops->unlink_prefilter(meta, name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_unlink_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ if (ops->unlink_postfilter)
+ return ops->unlink_postfilter(meta, name);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_unlink_backing(struct bpf_fuse_args *fa, int *out, struct inode *dir, struct dentry *entry)
{
struct path backing_path;
@@ -2577,6 +3045,7 @@ int fuse_bpf_unlink(int *out, struct inode *dir, struct dentry *entry)
{
return bpf_fuse_backing(dir, struct fuse_buffer, out,
fuse_unlink_initialize_in, fuse_unlink_initialize_out,
+ fuse_unlink_prefilter, fuse_unlink_postfilter,
fuse_unlink_backing, fuse_unlink_finalize,
dir, entry);
}
@@ -2629,6 +3098,23 @@ static int fuse_link_initialize_out(struct bpf_fuse_args *fa, struct fuse_link_a
return 0;
}

+static int fuse_link_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_link_args *args)
+{
+ if (ops->link_prefilter)
+ return ops->link_prefilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_link_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_link_args *args)
+{
+ if (ops->link_postfilter)
+ return ops->link_postfilter(meta, &args->in, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+
static int fuse_link_backing(struct bpf_fuse_args *fa, int *out, struct dentry *entry,
struct inode *dir, struct dentry *newent)
{
@@ -2696,6 +3182,7 @@ int fuse_bpf_link(int *out, struct inode *inode, struct dentry *entry,
{
return bpf_fuse_backing(inode, struct fuse_link_args, out,
fuse_link_initialize_in, fuse_link_initialize_out,
+ fuse_link_prefilter, fuse_link_postfilter,
fuse_link_backing, fuse_link_finalize,
entry, newdir, newent);
}
@@ -2744,6 +3231,22 @@ static int fuse_getattr_initialize_out(struct bpf_fuse_args *fa, struct fuse_get
return 0;
}

+static int fuse_getattr_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_getattr_args *args)
+{
+ if (ops->getattr_prefilter)
+ return ops->getattr_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_getattr_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_getattr_args *args)
+{
+ if (ops->getattr_postfilter)
+ return ops->getattr_postfilter(meta, &args->in, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_getattr_backing(struct bpf_fuse_args *fa, int *out,
const struct dentry *entry, struct kstat *stat,
u32 request_mask, unsigned int flags)
@@ -2804,6 +3307,7 @@ int fuse_bpf_getattr(int *out, struct inode *inode, const struct dentry *entry,
{
return bpf_fuse_backing(inode, struct fuse_getattr_args, out,
fuse_getattr_initialize_in, fuse_getattr_initialize_out,
+ fuse_getattr_prefilter, fuse_getattr_postfilter,
fuse_getattr_backing, fuse_getattr_finalize,
entry, stat, request_mask, flags);
}
@@ -2883,6 +3387,22 @@ static int fuse_setattr_initialize_out(struct bpf_fuse_args *fa, struct fuse_set
return 0;
}

+static int fuse_setattr_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_setattr_args *args)
+{
+ if (ops->setattr_prefilter)
+ return ops->setattr_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_setattr_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_setattr_args *args)
+{
+ if (ops->setattr_postfilter)
+ return ops->setattr_postfilter(meta, &args->in, &args->out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_setattr_backing(struct bpf_fuse_args *fa, int *out,
struct dentry *dentry, struct iattr *attr, struct file *file)
{
@@ -2920,6 +3440,7 @@ int fuse_bpf_setattr(int *out, struct inode *inode, struct dentry *dentry, struc
{
return bpf_fuse_backing(inode, struct fuse_setattr_args, out,
fuse_setattr_initialize_in, fuse_setattr_initialize_out,
+ fuse_setattr_prefilter, fuse_setattr_postfilter,
fuse_setattr_backing, fuse_setattr_finalize,
dentry, attr, file);
}
@@ -2949,6 +3470,22 @@ static int fuse_statfs_initialize_out(struct bpf_fuse_args *fa, struct fuse_stat
return 0;
}

+static int fuse_statfs_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_statfs_out *out)
+{
+ if (ops->statfs_prefilter)
+ return ops->statfs_prefilter(meta);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_statfs_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_statfs_out *out)
+{
+ if (ops->statfs_postfilter)
+ return ops->statfs_postfilter(meta, out);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_statfs_backing(struct bpf_fuse_args *fa, int *out,
struct dentry *dentry, struct kstatfs *buf)
{
@@ -2984,6 +3521,7 @@ int fuse_bpf_statfs(int *out, struct inode *inode, struct dentry *dentry, struct
{
return bpf_fuse_backing(dentry->d_inode, struct fuse_statfs_out, out,
fuse_statfs_initialize_in, fuse_statfs_initialize_out,
+ fuse_statfs_prefilter, fuse_statfs_postfilter,
fuse_statfs_backing, fuse_statfs_finalize,
dentry, buf);
}
@@ -3053,6 +3591,22 @@ static int fuse_get_link_initialize_out(struct bpf_fuse_args *fa, struct fuse_ge
return 0;
}

+static int fuse_get_link_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_get_link_args *args)
+{
+ if (ops->get_link_prefilter)
+ return ops->get_link_prefilter(meta, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_get_link_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_get_link_args *args)
+{
+ if (ops->get_link_postfilter)
+ return ops->get_link_postfilter(meta, &args->name);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_get_link_backing(struct bpf_fuse_args *fa, const char **out,
struct inode *inode, struct dentry *dentry,
struct delayed_call *callback)
@@ -3092,6 +3646,7 @@ int fuse_bpf_get_link(const char **out, struct inode *inode, struct dentry *dent
{
return bpf_fuse_backing(inode, struct fuse_get_link_args, out,
fuse_get_link_initialize_in, fuse_get_link_initialize_out,
+ fuse_get_link_prefilter, fuse_get_link_postfilter,
fuse_get_link_backing, fuse_get_link_finalize,
inode, dentry, callback);
}
@@ -3142,6 +3697,22 @@ static int fuse_symlink_initialize_out(struct bpf_fuse_args *fa, struct fuse_sym
return 0;
}

+static int fuse_symlink_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_symlink_args *args)
+{
+ if (ops->symlink_prefilter)
+ return ops->symlink_prefilter(meta, &args->name, &args->path);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_symlink_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_symlink_args *args)
+{
+ if (ops->symlink_postfilter)
+ return ops->symlink_postfilter(meta, &args->name, &args->path);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_symlink_backing(struct bpf_fuse_args *fa, int *out,
struct inode *dir, struct dentry *entry, const char *link, int len)
{
@@ -3192,6 +3763,7 @@ int fuse_bpf_symlink(int *out, struct inode *dir, struct dentry *entry, const c
{
return bpf_fuse_backing(dir, struct fuse_symlink_args, out,
fuse_symlink_initialize_in, fuse_symlink_initialize_out,
+ fuse_symlink_prefilter, fuse_symlink_postfilter,
fuse_symlink_backing, fuse_symlink_finalize,
dir, entry, link, len);
}
@@ -3265,6 +3837,22 @@ static int fuse_readdir_initialize_out(struct bpf_fuse_args *fa, struct fuse_rea
return 0;
}

+static int fuse_readdir_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_read_args *args)
+{
+ if (ops->readdir_prefilter)
+ return ops->readdir_prefilter(meta, &args->in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_readdir_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_read_args *args)
+{
+ if (ops->readdir_postfilter)
+ return ops->readdir_postfilter(meta, &args->in, &args->out, &args->buffer);
+ return BPF_FUSE_CONTINUE;
+}
+
struct fusebpf_ctx {
struct dir_context ctx;
u8 *addr;
@@ -3380,6 +3968,7 @@ int fuse_bpf_readdir(int *out, struct inode *inode, struct file *file, struct di
again:
ret = bpf_fuse_backing(inode, struct fuse_read_args, out,
fuse_readdir_initialize_in, fuse_readdir_initialize_out,
+ fuse_readdir_prefilter, fuse_readdir_postfilter,
fuse_readdir_backing, fuse_readdir_finalize,
file, ctx, &force_again, &allow_force, is_continued);
if (force_again && *out >= 0) {
@@ -3416,6 +4005,22 @@ static int fuse_access_initialize_out(struct bpf_fuse_args *fa, struct fuse_acce
return 0;
}

+static int fuse_access_prefilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_access_in *in)
+{
+ if (ops->access_prefilter)
+ return ops->access_prefilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
+static int fuse_access_postfilter(struct fuse_ops *ops, struct bpf_fuse_meta_info *meta,
+ struct fuse_access_in *in)
+{
+ if (ops->access_postfilter)
+ return ops->access_postfilter(meta, in);
+ return BPF_FUSE_CONTINUE;
+}
+
static int fuse_access_backing(struct bpf_fuse_args *fa, int *out, struct inode *inode, int mask)
{
struct fuse_inode *fi = get_fuse_inode(inode);
@@ -3434,6 +4039,7 @@ int fuse_bpf_access(int *out, struct inode *inode, int mask)
{
return bpf_fuse_backing(inode, struct fuse_access_in, out,
fuse_access_initialize_in, fuse_access_initialize_out,
+ fuse_access_prefilter, fuse_access_postfilter,
fuse_access_backing, fuse_access_finalize, inode, mask);
}

--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:46:57

by Daniel Rosenberg

Subject: [RFC PATCH v3 33/37] fuse-bpf: Add userspace pre/post filters

This allows fuse-bpf to call out to userspace to handle pre and post
filters. Any of the inputs may be changed by the prefilter, so we must
handle up to 3 outputs. For the postfilter, our inputs include the
output arguments, so we must handle up to 5 inputs.

Additionally, we add an extension for passing the return code of
the backing call to the postfilter, adding one additional possible
input and bringing the total to 6.
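
The slot arithmetic can be sanity-checked with a small sketch. It mirrors
the FUSE_EXTENDED_MAX_ARGS_* macros this patch adds to fs/fuse/fuse_i.h.
The values 3 and 2 for FUSE_MAX_ARGS_IN and FUSE_MAX_ARGS_OUT are an
assumption taken from the array sizes the patch replaces in struct
fuse_args, and both helper functions are hypothetical. Note that in the
code the error extension occupies an input slot.

```c
#include <assert.h>

/* Assumed values: the in_args[3]/out_args[2] sizes this patch replaces. */
#define FUSE_MAX_ARGS_IN  3
#define FUSE_MAX_ARGS_OUT 2

/* Mirrors the macros added to fs/fuse/fuse_i.h by this patch. */
#define FUSE_EXTENDED_MAX_ARGS_IN (FUSE_MAX_ARGS_IN + FUSE_MAX_ARGS_OUT + 1)
#if FUSE_MAX_ARGS_IN > FUSE_MAX_ARGS_OUT
#define FUSE_EXTENDED_MAX_ARGS_OUT FUSE_MAX_ARGS_IN
#else
#define FUSE_EXTENDED_MAX_ARGS_OUT FUSE_MAX_ARGS_OUT
#endif

/*
 * Hypothetical helper: in-arg slots a userspace postfilter request uses.
 * The original inputs and the backing outputs are all passed as inputs,
 * plus one slot when the FUSE_ERROR_IN extension is attached.
 */
unsigned int postfilter_in_slots(unsigned int in_numargs,
				 unsigned int out_numargs,
				 int has_error_ext)
{
	unsigned int n = in_numargs + out_numargs + (has_error_ext ? 1 : 0);

	assert(n <= FUSE_EXTENDED_MAX_ARGS_IN);
	return n;
}

/*
 * Hypothetical helper: out-arg slots a userspace prefilter request uses.
 * Every input may be rewritten, except a trailing immutable name buffer.
 */
unsigned int prefilter_out_slots(unsigned int in_numargs,
				 int last_is_immutable)
{
	unsigned int n = in_numargs;

	if (last_is_immutable && n > 0)
		n--;
	assert(n <= FUSE_EXTENDED_MAX_ARGS_OUT);
	return n;
}
```

With the assumed limits, a postfilter carrying three inputs, two outputs
and the error extension needs six input slots, and a prefilter can write
back at most three outputs.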

As long as you don't request both pre-filter and post-filter in
userspace, we will end up doing fewer round trips to userspace.
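
To make the round-trip count concrete, here is a rough model of the
dispatch that the bpf_fuse_backing macro performs in this patch. It is a
simplification, not the kernel code: the function name is hypothetical,
the return-code values are copied from bpf_common.h later in this series,
and the backing call and any BPF postfilter are elided.

```c
#include <assert.h>

/* Return codes copied from bpf_common.h in this series. */
#define BPF_FUSE_CONTINUE        0
#define BPF_FUSE_USER            1
#define BPF_FUSE_USER_PREFILTER  2
#define BPF_FUSE_POSTFILTER      3
#define BPF_FUSE_USER_POSTFILTER 4

/*
 * Hypothetical model: count the userspace round trips for one operation.
 * bpf_ret is what the BPF prefilter returned; user_ret is the code the
 * userspace prefilter replies with (only consulted if it is called).
 */
int count_user_round_trips(int bpf_ret, int user_ret)
{
	int bpf_next = bpf_ret;
	int trips = 0;

	if (bpf_next == BPF_FUSE_USER_PREFILTER) {
		trips++;		/* fuse_prefilter_simple_request() */
		bpf_next = user_ret;	/* bpf_next = fa.ret */
	}

	/* ... the backing call and any BPF postfilter would run here ... */

	if (bpf_next == BPF_FUSE_USER_POSTFILTER)
		trips++;		/* fuse_postfilter_simple_request() */

	assert(trips >= 0 && trips <= 2);
	return trips;
}
```

In this model an operation costs two round trips only when the userspace
prefilter itself asks for a userspace postfilter; every other combination
costs one or none.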

Signed-off-by: Daniel Rosenberg <[email protected]>
---
fs/fuse/backing.c | 179 ++++++++++++++++++++++++++++++++++++++
fs/fuse/dev.c | 2 +
fs/fuse/dir.c | 6 +-
fs/fuse/fuse_i.h | 33 ++++++-
include/linux/bpf_fuse.h | 1 +
include/uapi/linux/fuse.h | 1 +
6 files changed, 217 insertions(+), 5 deletions(-)

diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
index 9217e9f83d98..1de302fc91b6 100644
--- a/fs/fuse/backing.c
+++ b/fs/fuse/backing.c
@@ -14,6 +14,163 @@
#include <linux/namei.h>
#include <linux/uio.h>

+static void set_in_args(struct fuse_in_arg *dst, struct bpf_fuse_arg *src)
+{
+ if (src->is_buffer) {
+ struct fuse_buffer *buffer = src->buffer;
+
+ *dst = (struct fuse_in_arg) {
+ .size = buffer->size,
+ .value = buffer->data,
+ };
+ } else {
+ *dst = (struct fuse_in_arg) {
+ .size = src->size,
+ .value = src->value,
+ };
+ }
+}
+
+static void set_out_args(struct fuse_arg *dst, struct bpf_fuse_arg *src)
+{
+ if (src->is_buffer) {
+ struct fuse_buffer *buffer = src->buffer;
+
+ // Userspace out args present as much space as needed
+ *dst = (struct fuse_arg) {
+ .size = buffer->max_size,
+ .value = buffer->data,
+ };
+ } else {
+ *dst = (struct fuse_arg) {
+ .size = src->size,
+ .value = src->value,
+ };
+ }
+}
+
+static int get_err_in(uint32_t error, struct fuse_in_arg *ext)
+{
+ struct fuse_ext_header *xh;
+ uint32_t *err_in;
+ uint32_t err_in_size = fuse_ext_size(sizeof(*err_in));
+
+ xh = extend_arg(ext, err_in_size);
+ if (!xh)
+ return -ENOMEM;
+ xh->size = err_in_size;
+ xh->type = FUSE_ERROR_IN;
+
+ err_in = (uint32_t *)&xh[1];
+ *err_in = error;
+ return 0;
+}
+
+static int get_filter_ext(struct fuse_args *args)
+{
+ struct fuse_in_arg ext = { .size = 0, .value = NULL };
+ int err = 0;
+
+ if (args->is_filter)
+ err = get_err_in(args->error_in, &ext);
+ if (!err && ext.size) {
+ WARN_ON(args->in_numargs >= ARRAY_SIZE(args->in_args));
+ args->is_ext = true;
+ args->ext_idx = args->in_numargs++;
+ args->in_args[args->ext_idx] = ext;
+ } else {
+ kfree(ext.value);
+ }
+ return err;
+}
+
+static ssize_t fuse_bpf_simple_request(struct fuse_mount *fm, struct bpf_fuse_args *fa,
+ unsigned short in_numargs, unsigned short out_numargs,
+ struct bpf_fuse_arg *out_arg_array, bool add_out_to_in)
+{
+ int i;
+ ssize_t res;
+
+ struct fuse_args args = {
+ .nodeid = fa->info.nodeid,
+ .opcode = fa->info.opcode,
+ .error_in = fa->info.error_in,
+ .in_numargs = in_numargs,
+ .out_numargs = out_numargs,
+ .force = !!(fa->flags & FUSE_BPF_FORCE),
+ .out_argvar = !!(fa->flags & FUSE_BPF_OUT_ARGVAR),
+ .is_lookup = !!(fa->flags & FUSE_BPF_IS_LOOKUP),
+ .is_filter = true,
+ };
+
+ /* All out args must be writeable */
+ for (i = 0; i < out_numargs; ++i) {
+ struct fuse_buffer *buffer;
+
+ if (!out_arg_array[i].is_buffer)
+ continue;
+ buffer = out_arg_array[i].buffer;
+ if (!bpf_fuse_get_writeable(buffer, buffer->max_size, true))
+ return -ENOMEM;
+ }
+
+ /* Set in args */
+ for (i = 0; i < fa->in_numargs; ++i)
+ set_in_args(&args.in_args[i], &fa->in_args[i]);
+ if (add_out_to_in) {
+ for (i = 0; i < fa->out_numargs; ++i) {
+ set_in_args(&args.in_args[fa->in_numargs + i], &fa->out_args[i]);
+ }
+ }
+
+ /* Set out args */
+ for (i = 0; i < out_numargs; ++i)
+ set_out_args(&args.out_args[i], &out_arg_array[i]);
+
+ if (out_arg_array[out_numargs - 1].is_buffer) {
+ struct fuse_buffer *buff = out_arg_array[out_numargs - 1].buffer;
+
+ if (buff->flags & BPF_FUSE_VARIABLE_SIZE)
+ args.out_argvar = true;
+ }
+ if (add_out_to_in) {
+ res = get_filter_ext(&args);
+ if (res)
+ return res;
+ }
+ res = fuse_simple_request(fm, &args);
+
+ /* update used areas of buffers */
+ for (i = 0; i < out_numargs; ++i)
+ if (out_arg_array[i].is_buffer &&
+ (out_arg_array[i].buffer->flags & BPF_FUSE_VARIABLE_SIZE))
+ out_arg_array[i].buffer->size = args.out_args[i].size;
+ fa->ret = args.ret;
+
+ free_ext_value(&args);
+
+ return res;
+}
+
+static ssize_t fuse_prefilter_simple_request(struct fuse_mount *fm, struct bpf_fuse_args *fa)
+{
+ uint32_t out_args = fa->in_numargs;
+
+ // mkdir and company are not permitted to change the name. This should be done at lookup
+ // Thus, these can't be set by the userspace prefilter
+ if (fa->in_args[fa->in_numargs - 1].is_buffer &&
+ (fa->in_args[fa->in_numargs - 1].buffer->flags & BPF_FUSE_IMMUTABLE))
+ out_args--;
+ return fuse_bpf_simple_request(fm, fa, fa->in_numargs, out_args,
+ fa->in_args, false);
+}
+
+static ssize_t fuse_postfilter_simple_request(struct fuse_mount *fm, struct bpf_fuse_args *fa)
+{
+ return fuse_bpf_simple_request(fm, fa, fa->in_numargs + fa->out_numargs, fa->out_numargs,
+ fa->out_args, true);
+}
+
static inline void bpf_fuse_set_in_immutable(struct bpf_fuse_args *fa)
{
int i;
@@ -60,9 +217,11 @@ static inline void bpf_fuse_free_alloced(struct bpf_fuse_args *fa)
({ \
struct fuse_inode *fuse_inode = get_fuse_inode(inode); \
struct fuse_ops *fuse_ops = fuse_inode->bpf_ops; \
+ struct fuse_mount *fm = get_fuse_mount(inode); \
struct bpf_fuse_args fa = { 0 }; \
bool initialized = false; \
bool handled = false; \
+ bool locked; \
ssize_t res; \
int bpf_next; \
io feo = { 0 }; \
@@ -88,6 +247,16 @@ static inline void bpf_fuse_free_alloced(struct bpf_fuse_args *fa)
break; \
} \
\
+ if (bpf_next == BPF_FUSE_USER_PREFILTER) { \
+ locked = fuse_lock_inode(inode); \
+ res = fuse_prefilter_simple_request(fm, &fa); \
+ fuse_unlock_inode(inode, locked); \
+ if (res < 0) { \
+ error = res; \
+ break; \
+ } \
+ bpf_next = fa.ret; \
+ } \
bpf_fuse_set_in_immutable(&fa); \
\
error = initialize_out(&fa, &feo, args); \
@@ -117,6 +286,16 @@ static inline void bpf_fuse_free_alloced(struct bpf_fuse_args *fa)
break; \
} \
\
+ if (!(bpf_next == BPF_FUSE_USER_POSTFILTER)) \
+ break; \
+ \
+ locked = fuse_lock_inode(inode); \
+ res = fuse_postfilter_simple_request(fm, &fa); \
+ fuse_unlock_inode(inode, locked); \
+ if (res < 0) { \
+ error = res; \
+ break; \
+ } \
} while (false); \
\
if (initialized && handled) { \
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index ad7d9d1e6da5..139f40b70228 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -521,6 +521,8 @@ ssize_t fuse_simple_request(struct fuse_mount *fm, struct fuse_args *args)
BUG_ON(args->out_numargs == 0);
ret = args->out_args[args->out_numargs - 1].size;
}
+ if (args->is_filter && args->is_ext)
+ args->ret = req->out.h.error;
fuse_put_request(req);

return ret;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index b7bc8260a537..bea5f1698127 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -620,7 +620,7 @@ static int get_security_context(struct dentry *entry, umode_t mode,
return err;
}

-static void *extend_arg(struct fuse_in_arg *buf, u32 bytes)
+void *extend_arg(struct fuse_in_arg *buf, u32 bytes)
{
void *p;
u32 newlen = buf->size + bytes;
@@ -640,7 +640,7 @@ static void *extend_arg(struct fuse_in_arg *buf, u32 bytes)
return p + newlen - bytes;
}

-static u32 fuse_ext_size(size_t size)
+u32 fuse_ext_size(size_t size)
{
return FUSE_REC_ALIGN(sizeof(struct fuse_ext_header) + size);
}
@@ -700,7 +700,7 @@ static int get_create_ext(struct fuse_args *args,
return err;
}

-static void free_ext_value(struct fuse_args *args)
+void free_ext_value(struct fuse_args *args)
{
if (args->is_ext)
kfree(args->in_args[args->ext_idx].value);
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 15962ab3b381..0504c136632d 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -304,6 +304,17 @@ struct fuse_page_desc {
unsigned int offset;
};

+/* To deal with bpf pre and post filters in userspace calls, we must support
+ * passing the inputs and outputs as inputs, and we must have enough space in
+ * outputs to handle all of the inputs. Plus one more for extensions.
+ */
+#define FUSE_EXTENDED_MAX_ARGS_IN (FUSE_MAX_ARGS_IN + FUSE_MAX_ARGS_OUT + 1)
+#if FUSE_MAX_ARGS_IN > FUSE_MAX_ARGS_OUT
+#define FUSE_EXTENDED_MAX_ARGS_OUT FUSE_MAX_ARGS_IN
+#else
+#define FUSE_EXTENDED_MAX_ARGS_OUT FUSE_MAX_ARGS_OUT
+#endif
+
struct fuse_args {
uint64_t nodeid;
uint32_t opcode;
@@ -322,10 +333,12 @@ struct fuse_args {
bool page_replace:1;
bool may_block:1;
bool is_ext:1;
+ bool is_filter:1;
bool is_lookup:1;
bool via_ioctl:1;
- struct fuse_in_arg in_args[3];
- struct fuse_arg out_args[2];
+ uint32_t ret;
+ struct fuse_in_arg in_args[FUSE_EXTENDED_MAX_ARGS_IN];
+ struct fuse_arg out_args[FUSE_EXTENDED_MAX_ARGS_OUT];
void (*end)(struct fuse_mount *fm, struct fuse_args *args, int error);
};

@@ -1165,6 +1178,22 @@ void fuse_request_end(struct fuse_req *req);
void fuse_abort_conn(struct fuse_conn *fc);
void fuse_wait_aborted(struct fuse_conn *fc);

+/**
+ * Allocate/Reallocate extended header information
+ * Returns pointer to start of most recent allocation
+ */
+void *extend_arg(struct fuse_in_arg *buf, u32 bytes);
+
+/**
+ * Returns adjusted size field for extensions
+ */
+u32 fuse_ext_size(size_t size);
+
+/**
+ * Free allocated extended header information
+ */
+void free_ext_value(struct fuse_args *args);
+
/**
* Invalidate inode attributes
*/
diff --git a/include/linux/bpf_fuse.h b/include/linux/bpf_fuse.h
index 2183a7a45c92..159b850e1b46 100644
--- a/include/linux/bpf_fuse.h
+++ b/include/linux/bpf_fuse.h
@@ -64,6 +64,7 @@ struct bpf_fuse_args {
uint32_t in_numargs;
uint32_t out_numargs;
uint32_t flags;
+ uint32_t ret;
struct bpf_fuse_arg in_args[FUSE_MAX_ARGS_IN];
struct bpf_fuse_arg out_args[FUSE_MAX_ARGS_OUT];
};
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index e779064f5fad..bbcda421ee8e 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -520,6 +520,7 @@ enum fuse_ext_type {
/* Types 0..31 are reserved for fuse_secctx_header */
FUSE_MAX_NR_SECCTX = 31,
FUSE_EXT_GROUPS = 32,
+ FUSE_ERROR_IN = 33,
};

enum fuse_opcode {
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 01:54:06

by Daniel Rosenberg

Subject: [RFC PATCH v3 36/37] fuse-bpf: Add selftests

Adds basic selftests for fuse. These check that you can add fuse_op
programs and perform basic operations.
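
As a rough illustration of what adding a fuse_op program by name involves,
the sketch below fills a fuse_bpf_entry_out as declared in bpf_common.h
from this patch. fill_bpf_entry() is a hypothetical helper, and pairing
FUSE_ENTRY_BPF with the name field is an assumption based on the cover
letter's note that programs are specified by name, limited to 15
characters.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copied from bpf_common.h in this patch. */
enum fuse_bpf_type {
	FUSE_ENTRY_BACKING		= 1,
	FUSE_ENTRY_BPF			= 2,
	FUSE_ENTRY_REMOVE_BACKING	= 3,
	FUSE_ENTRY_REMOVE_BPF		= 4,
};

#define BPF_FUSE_NAME_MAX 15
struct fuse_bpf_entry_out {
	uint32_t entry_type;
	uint32_t unused;
	union {
		struct {
			uint64_t unused2;
			uint64_t fd;
		};
		char name[BPF_FUSE_NAME_MAX + 1];
	};
};

/*
 * Hypothetical helper: describe a fuse_op program by name.
 * Returns 0 on success, -1 if the name exceeds BPF_FUSE_NAME_MAX.
 */
int fill_bpf_entry(struct fuse_bpf_entry_out *e, const char *name)
{
	size_t len;

	assert(e != NULL && name != NULL);
	len = strlen(name);
	if (len > BPF_FUSE_NAME_MAX)
		return -1;

	/* Zeroing first also guarantees the name is NUL-terminated. */
	memset(e, 0, sizeof(*e));
	e->entry_type = FUSE_ENTRY_BPF;
	memcpy(e->name, name, len);
	return 0;
}
```

Rejecting over-long names rather than truncating them avoids silently
selecting a different program whose name shares the same 15-character
prefix.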

Signed-off-by: Daniel Rosenberg <[email protected]>
Signed-off-by: Paul Lawrence <[email protected]>
Signed-off-by: Alessio Balsini <[email protected]>

---
.../selftests/filesystems/fuse/.gitignore | 2 +
.../selftests/filesystems/fuse/Makefile | 188 ++
.../testing/selftests/filesystems/fuse/OWNERS | 2 +
.../selftests/filesystems/fuse/bpf_common.h | 51 +
.../selftests/filesystems/fuse/bpf_loader.c | 597 ++++
.../testing/selftests/filesystems/fuse/fd.txt | 21 +
.../selftests/filesystems/fuse/fd_bpf.bpf.c | 397 +++
.../selftests/filesystems/fuse/fuse_daemon.c | 300 ++
.../selftests/filesystems/fuse/fuse_test.c | 2412 +++++++++++++++++
.../selftests/filesystems/fuse/test.bpf.c | 996 +++++++
.../filesystems/fuse/test_framework.h | 172 ++
.../selftests/filesystems/fuse/test_fuse.h | 494 ++++
12 files changed, 5632 insertions(+)
create mode 100644 tools/testing/selftests/filesystems/fuse/.gitignore
create mode 100644 tools/testing/selftests/filesystems/fuse/Makefile
create mode 100644 tools/testing/selftests/filesystems/fuse/OWNERS
create mode 100644 tools/testing/selftests/filesystems/fuse/bpf_common.h
create mode 100644 tools/testing/selftests/filesystems/fuse/bpf_loader.c
create mode 100644 tools/testing/selftests/filesystems/fuse/fd.txt
create mode 100644 tools/testing/selftests/filesystems/fuse/fd_bpf.bpf.c
create mode 100644 tools/testing/selftests/filesystems/fuse/fuse_daemon.c
create mode 100644 tools/testing/selftests/filesystems/fuse/fuse_test.c
create mode 100644 tools/testing/selftests/filesystems/fuse/test.bpf.c
create mode 100644 tools/testing/selftests/filesystems/fuse/test_framework.h
create mode 100644 tools/testing/selftests/filesystems/fuse/test_fuse.h

diff --git a/tools/testing/selftests/filesystems/fuse/.gitignore b/tools/testing/selftests/filesystems/fuse/.gitignore
new file mode 100644
index 000000000000..3ee9a27fe66a
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/.gitignore
@@ -0,0 +1,2 @@
+fuse_test
+*.raw
diff --git a/tools/testing/selftests/filesystems/fuse/Makefile b/tools/testing/selftests/filesystems/fuse/Makefile
new file mode 100644
index 000000000000..b2df4dec0651
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/Makefile
@@ -0,0 +1,188 @@
+# SPDX-License-Identifier: GPL-2.0
+include ../../../../build/Build.include
+include ../../../../scripts/Makefile.arch
+include ../../../../scripts/Makefile.include
+
+#if 0
+ifneq ($(LLVM),)
+ifneq ($(filter %/,$(LLVM)),)
+LLVM_PREFIX := $(LLVM)
+else ifneq ($(filter -%,$(LLVM)),)
+LLVM_SUFFIX := $(LLVM)
+endif
+
+CLANG_TARGET_FLAGS_arm := arm-linux-gnueabi
+CLANG_TARGET_FLAGS_arm64 := aarch64-linux-gnu
+CLANG_TARGET_FLAGS_hexagon := hexagon-linux-musl
+CLANG_TARGET_FLAGS_m68k := m68k-linux-gnu
+CLANG_TARGET_FLAGS_mips := mipsel-linux-gnu
+CLANG_TARGET_FLAGS_powerpc := powerpc64le-linux-gnu
+CLANG_TARGET_FLAGS_riscv := riscv64-linux-gnu
+CLANG_TARGET_FLAGS_s390 := s390x-linux-gnu
+CLANG_TARGET_FLAGS_x86 := x86_64-linux-gnu
+CLANG_TARGET_FLAGS := $(CLANG_TARGET_FLAGS_$(ARCH))
+#endif
+
+ifeq ($(CROSS_COMPILE),)
+ifeq ($(CLANG_TARGET_FLAGS),)
+$(error Specify CROSS_COMPILE or add '--target=' option to lib.mk)
+else
+CLANG_FLAGS += --target=$(CLANG_TARGET_FLAGS)
+endif # CLANG_TARGET_FLAGS
+else
+CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
+endif # CROSS_COMPILE
+
+CC := $(LLVM_PREFIX)clang$(LLVM_SUFFIX) $(CLANG_FLAGS) -fintegrated-as
+else
+CC := $(CROSS_COMPILE)gcc
+endif # LLVM
+
+CURDIR := $(abspath .)
+TOOLSDIR := $(abspath ../../../..)
+LIBDIR := $(TOOLSDIR)/lib
+BPFDIR := $(LIBDIR)/bpf
+TOOLSINCDIR := $(TOOLSDIR)/include
+BPFTOOLDIR := $(TOOLSDIR)/bpf/bpftool
+APIDIR := $(TOOLSINCDIR)/uapi
+GENDIR := $(abspath ../../../../../include/generated)
+GENHDR := $(GENDIR)/autoconf.h
+SELFTESTS:=$(TOOLSDIR)/testing/selftests/
+
+LDLIBS := -lpthread -lelf -lz
+TEST_GEN_PROGS := fuse_test fuse_daemon
+TEST_GEN_FILES := \
+ test.skel.h \
+ fd.sh \
+
+include ../../lib.mk
+
+# Put after include ../../lib.mk since that changes $(TEST_GEN_PROGS)
+# Otherwise you get multiple targets, this becomes the default, and it's a mess
+EXTRA_SOURCES := bpf_loader.c $(OUTPUT)/test.skel.h
+$(TEST_GEN_PROGS) : $(EXTRA_SOURCES) $(BPFOBJ)
+
+SCRATCH_DIR := $(OUTPUT)/tools
+BUILD_DIR := $(SCRATCH_DIR)/build
+INCLUDE_DIR := $(SCRATCH_DIR)/include
+BPFOBJ := $(BUILD_DIR)/libbpf/libbpf.a
+SKEL_DIR := $(OUTPUT)
+ifneq ($(CROSS_COMPILE),)
+HOST_BUILD_DIR := $(BUILD_DIR)/host
+HOST_SCRATCH_DIR := host-tools
+HOST_INCLUDE_DIR := $(HOST_SCRATCH_DIR)/include
+else
+HOST_BUILD_DIR := $(BUILD_DIR)
+HOST_SCRATCH_DIR := $(SCRATCH_DIR)
+HOST_INCLUDE_DIR := $(INCLUDE_DIR)
+endif
+HOST_BPFOBJ := $(HOST_BUILD_DIR)/libbpf/libbpf.a
+RESOLVE_BTFIDS := $(HOST_BUILD_DIR)/resolve_btfids/resolve_btfids
+DEFAULT_BPFTOOL := $(HOST_SCRATCH_DIR)/sbin/bpftool
+
+VMLINUX_BTF_PATHS ?= $(if $(OUTPUT),$(OUTPUT)/../../../../../vmlinux) \
+ $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux) \
+ ../../../../../vmlinux \
+ /sys/kernel/btf/vmlinux \
+ /boot/vmlinux-$(shell uname -r)
+VMLINUX_BTF ?= $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
+ifeq ($(VMLINUX_BTF),)
+$(error Cannot find a vmlinux for VMLINUX_BTF at any of "$(VMLINUX_BTF_PATHS)")
+endif
+
+BPFTOOL ?= $(DEFAULT_BPFTOOL)
+
+ifneq ($(wildcard $(GENHDR)),)
+ GENFLAGS := -DHAVE_GENHDR
+endif
+
+CFLAGS += -g -O2 -rdynamic -pthread -Wall -Werror $(GENFLAGS) \
+ -I$(INCLUDE_DIR) -I$(GENDIR) -I$(LIBDIR) \
+ -I$(TOOLSINCDIR) -I$(APIDIR) -I$(SELFTESTS) \
+ -I$(SKEL_DIR)
+
+# Silence some warnings when compiled with clang
+ifneq ($(LLVM),)
+CFLAGS += -Wno-unused-command-line-argument
+endif
+
+#LDFLAGS = -lelf -lz
+
+IS_LITTLE_ENDIAN = $(shell $(CC) -dM -E - </dev/null | \
+ grep 'define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__')
+
+# Get Clang's default includes on this system, as opposed to those seen by
+# '-target bpf'. This fixes "missing" files on some architectures/distros,
+# such as asm/byteorder.h, asm/socket.h, asm/sockios.h, sys/cdefs.h etc.
+#
+# Use '-idirafter': Don't interfere with include mechanics except where the
+# build would have failed anyways.
+define get_sys_includes
+$(shell $(1) -v -E - </dev/null 2>&1 \
+ | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }') \
+$(shell $(1) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}')
+endef
+
+BPF_CFLAGS = -g -D__TARGET_ARCH_$(SRCARCH) \
+ $(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian) \
+ -I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR) \
+ -I../../../../../include \
+ $(call get_sys_includes,$(CLANG)) \
+ -Wno-compare-distinct-pointer-types \
+ -O2 -mcpu=v3
+
+# sort removes libbpf duplicates when not cross-building
+MAKE_DIRS := $(sort $(BUILD_DIR)/libbpf $(HOST_BUILD_DIR)/libbpf \
+ $(HOST_BUILD_DIR)/bpftool $(HOST_BUILD_DIR)/resolve_btfids \
+ $(INCLUDE_DIR))
+
+$(MAKE_DIRS):
+ $(call msg,MKDIR,,$@)
+ $(Q)mkdir -p $@
+
+$(BPFOBJ): $(wildcard $(BPFDIR)/*.[ch] $(BPFDIR)/Makefile) \
+ $(APIDIR)/linux/bpf.h \
+ | $(BUILD_DIR)/libbpf
+ $(Q)$(MAKE) $(submake_extras) -C $(BPFDIR) OUTPUT=$(BUILD_DIR)/libbpf/ \
+ EXTRA_CFLAGS='-g -O0' \
+ DESTDIR=$(SCRATCH_DIR) prefix= all install_headers
+
+$(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \
+ $(HOST_BPFOBJ) | $(HOST_BUILD_DIR)/bpftool
+ $(Q)$(MAKE) $(submake_extras) -C $(BPFTOOLDIR) \
+ ARCH= CROSS_COMPILE= CC=$(HOSTCC) LD=$(HOSTLD) \
+ EXTRA_CFLAGS='-g -O0' \
+ OUTPUT=$(HOST_BUILD_DIR)/bpftool/ \
+ LIBBPF_OUTPUT=$(HOST_BUILD_DIR)/libbpf/ \
+ LIBBPF_DESTDIR=$(HOST_SCRATCH_DIR)/ \
+ prefix= DESTDIR=$(HOST_SCRATCH_DIR)/ install-bin
+
+$(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR)
+ifeq ($(VMLINUX_H),)
+ $(call msg,GEN,,$@)
+ $(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@
+else
+ $(call msg,CP,,$@)
+ $(Q)cp "$(VMLINUX_H)" $@
+endif
+
+$(OUTPUT)/fuse_daemon: LDLIBS := $(HOST_BPFOBJ) $(LDLIBS)
+$(OUTPUT)/fuse_test: LDLIBS := $(HOST_BPFOBJ) $(LDLIBS)
+
+$(OUTPUT)/%.bpf.o: %.bpf.c $(INCLUDE_DIR)/vmlinux.h \
+ | $(BPFOBJ)
+ $(call msg,CLNG-BPF,,$@)
+ $(Q)$(CLANG) $(BPF_CFLAGS) -target bpf -c $< -o $@
+
+$(OUTPUT)/%.skel.h: $(OUTPUT)/%.bpf.o $(BPFTOOL)
+ $(call msg,GEN-SKEL,,$@)
+ $(Q)$(BPFTOOL) gen object $(<:.o=.linked1.o) $<
+ $(Q)$(BPFTOOL) gen object $(<:.o=.linked2.o) $(<:.o=.linked1.o)
+ $(Q)$(BPFTOOL) gen object $(<:.o=.linked3.o) $(<:.o=.linked2.o)
+ $(Q)diff $(<:.o=.linked2.o) $(<:.o=.linked3.o)
+ $(Q)$(BPFTOOL) gen skeleton $(<:.o=.linked3.o) name $(notdir $(<:.bpf.o=))_bpf > $@
+ $(Q)$(BPFTOOL) gen subskeleton $(<:.o=.linked3.o) name $(notdir $(<:.bpf.o=))_bpf > $(@:.skel.h=.subskel.h)
+
+$(OUTPUT)/fd.sh: fd.txt
+ cp $< $@
+ chmod 755 $@
diff --git a/tools/testing/selftests/filesystems/fuse/OWNERS b/tools/testing/selftests/filesystems/fuse/OWNERS
new file mode 100644
index 000000000000..5eb371e1a5a3
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/OWNERS
@@ -0,0 +1,2 @@
+# include OWNERS from the authoritative android-mainline branch
+include kernel/common:android-mainline:/tools/testing/selftests/filesystems/incfs/OWNERS
diff --git a/tools/testing/selftests/filesystems/fuse/bpf_common.h b/tools/testing/selftests/filesystems/fuse/bpf_common.h
new file mode 100644
index 000000000000..dcf9efaef0f4
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/bpf_common.h
@@ -0,0 +1,51 @@
+// TODO: Insert description here. (generated by drosen)
+
+#ifndef _BPF_COMMON_H_
+#define _BPF_COMMON_H_
+
+/* Return Codes for Fuse BPF programs */
+#define BPF_FUSE_CONTINUE 0
+#define BPF_FUSE_USER 1
+#define BPF_FUSE_USER_PREFILTER 2
+#define BPF_FUSE_POSTFILTER 3
+#define BPF_FUSE_USER_POSTFILTER 4
+
+enum fuse_bpf_type {
+ FUSE_ENTRY_BACKING = 1,
+ FUSE_ENTRY_BPF = 2,
+ FUSE_ENTRY_REMOVE_BACKING = 3,
+ FUSE_ENTRY_REMOVE_BPF = 4,
+};
+
+#define BPF_FUSE_NAME_MAX 15
+struct fuse_bpf_entry_out {
+ uint32_t entry_type;
+ uint32_t unused;
+ union {
+ struct {
+ uint64_t unused2;
+ uint64_t fd;
+ };
+ char name[BPF_FUSE_NAME_MAX + 1];
+ };
+};
+
+/* Op Code Filter values for BPF Programs */
+#define FUSE_OPCODE_FILTER 0x0ffff
+#define FUSE_PREFILTER 0x10000
+#define FUSE_POSTFILTER 0x20000
+
+#define BPF_FUSE_NAME_MAX 15
+
+#define BPF_STRUCT_OPS(type, name, args...) \
+SEC("struct_ops/"#name) \
+type BPF_PROG(name, ##args)
+
+/* available kfuncs for fuse_bpf */
+extern uint32_t bpf_fuse_return_len(struct fuse_buffer *ptr) __ksym;
+extern void bpf_fuse_get_rw_dynptr(struct fuse_buffer *buffer, struct bpf_dynptr *dynptr, u64 size, bool copy) __ksym;
+extern void bpf_fuse_get_ro_dynptr(const struct fuse_buffer *buffer, struct bpf_dynptr *dynptr) __ksym;
+extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, u32 offset, void *buffer, u32 buffer__szk) __ksym;
+extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *ptr, u32 offset, void *buffer, u32 buffer__szk) __ksym;
+
+#endif /* _BPF_COMMON_H_ */
diff --git a/tools/testing/selftests/filesystems/fuse/bpf_loader.c b/tools/testing/selftests/filesystems/fuse/bpf_loader.c
new file mode 100644
index 000000000000..ebcced7f9430
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/bpf_loader.c
@@ -0,0 +1,597 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2021 Google LLC
+ */
+
+#include "test_fuse.h"
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <gelf.h>
+#include <libelf.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <sys/mount.h>
+#include <sys/stat.h>
+#include <sys/statfs.h>
+#include <sys/xattr.h>
+
+#include <linux/unistd.h>
+
+#include <uapi/linux/fuse.h>
+#include <uapi/linux/bpf.h>
+
+struct _test_options test_options;
+
+struct s s(const char *s1)
+{
+ struct s s = {0};
+
+ if (!s1)
+ return s;
+
+ s.s = malloc(strlen(s1) + 1);
+ if (!s.s)
+ return s;
+
+ strcpy(s.s, s1);
+ return s;
+}
+
+struct s sn(const char *s1, const char *s2)
+{
+ struct s s = {0};
+
+ if (!s1)
+ return s;
+
+ s.s = malloc(s2 - s1 + 1);
+ if (!s.s)
+ return s;
+
+ strncpy(s.s, s1, s2 - s1);
+ s.s[s2 - s1] = 0;
+ return s;
+}
+
+int s_cmp(struct s s1, struct s s2)
+{
+ int result = -1;
+
+ if (!s1.s || !s2.s)
+ goto out;
+ result = strcmp(s1.s, s2.s);
+out:
+ free(s1.s);
+ free(s2.s);
+ return result;
+}
+
+struct s s_cat(struct s s1, struct s s2)
+{
+ struct s s = {0};
+
+ if (!s1.s || !s2.s)
+ goto out;
+
+ s.s = malloc(strlen(s1.s) + strlen(s2.s) + 1);
+ if (!s.s)
+ goto out;
+
+ strcpy(s.s, s1.s);
+ strcat(s.s, s2.s);
+out:
+ free(s1.s);
+ free(s2.s);
+ return s;
+}
+
+struct s s_splitleft(struct s s1, char c)
+{
+ struct s s = {0};
+ char *split;
+
+ if (!s1.s)
+ return s;
+
+ split = strchr(s1.s, c);
+ if (split)
+ s = sn(s1.s, split);
+
+ free(s1.s);
+ return s;
+}
+
+struct s s_splitright(struct s s1, char c)
+{
+ struct s s2 = {0};
+ char *split;
+
+ if (!s1.s)
+ return s2;
+
+ split = strchr(s1.s, c);
+ if (split)
+ s2 = s(split + 1);
+
+ free(s1.s);
+ return s2;
+}
+
+struct s s_word(struct s s1, char c, size_t n)
+{
+ while (n--)
+ s1 = s_splitright(s1, c);
+ return s_splitleft(s1, c);
+}
+
+struct s s_path(struct s s1, struct s s2)
+{
+ return s_cat(s_cat(s1, s("/")), s2);
+}
+
+struct s s_pathn(size_t n, struct s s1, ...)
+{
+ va_list argp;
+
+ va_start(argp, s1);
+ while (--n)
+ s1 = s_path(s1, va_arg(argp, struct s));
+ va_end(argp);
+ return s1;
+}
+
+int s_link(struct s src_pathname, struct s dst_pathname)
+{
+ int res;
+
+ if (src_pathname.s && dst_pathname.s) {
+ res = link(src_pathname.s, dst_pathname.s);
+ } else {
+ res = -1;
+ errno = ENOMEM;
+ }
+
+ free(src_pathname.s);
+ free(dst_pathname.s);
+ return res;
+}
+
+int s_symlink(struct s src_pathname, struct s dst_pathname)
+{
+ int res;
+
+ if (src_pathname.s && dst_pathname.s) {
+ res = symlink(src_pathname.s, dst_pathname.s);
+ } else {
+ res = -1;
+ errno = ENOMEM;
+ }
+
+ free(src_pathname.s);
+ free(dst_pathname.s);
+ return res;
+}
+
+
+int s_mkdir(struct s pathname, mode_t mode)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = mkdir(pathname.s, mode);
+ free(pathname.s);
+ return res;
+}
+
+int s_rmdir(struct s pathname)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = rmdir(pathname.s);
+ free(pathname.s);
+ return res;
+}
+
+int s_unlink(struct s pathname)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = unlink(pathname.s);
+ free(pathname.s);
+ return res;
+}
+
+int s_open(struct s pathname, int flags, ...)
+{
+ va_list ap;
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ va_start(ap, flags);
+ if ((flags & O_CREAT) || (flags & O_TMPFILE) == O_TMPFILE)
+ res = open(pathname.s, flags, va_arg(ap, mode_t));
+ else
+ res = open(pathname.s, flags);
+
+ free(pathname.s);
+ va_end(ap);
+ return res;
+}
+
+int s_openat(int dirfd, struct s pathname, int flags, ...)
+{
+ va_list ap;
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ va_start(ap, flags);
+ if ((flags & O_CREAT) || (flags & O_TMPFILE) == O_TMPFILE)
+ res = openat(dirfd, pathname.s, flags, va_arg(ap, mode_t));
+ else
+ res = openat(dirfd, pathname.s, flags);
+
+ free(pathname.s);
+ va_end(ap);
+ return res;
+}
+
+int s_creat(struct s pathname, mode_t mode)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = open(pathname.s, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, mode);
+ free(pathname.s);
+ return res;
+}
+
+int s_mkfifo(struct s pathname, mode_t mode)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = mknod(pathname.s, S_IFIFO | mode, 0);
+ free(pathname.s);
+ return res;
+}
+
+int s_stat(struct s pathname, struct stat *st)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = stat(pathname.s, st);
+ free(pathname.s);
+ return res;
+}
+
+int s_statfs(struct s pathname, struct statfs *st)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = statfs(pathname.s, st);
+ free(pathname.s);
+ return res;
+}
+
+DIR *s_opendir(struct s pathname)
+{
+ DIR *res;
+
+ res = opendir(pathname.s);
+ free(pathname.s);
+ return res;
+}
+
+int s_getxattr(struct s pathname, const char name[], void *value, size_t size,
+ ssize_t *ret_size)
+{
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ *ret_size = getxattr(pathname.s, name, value, size);
+ free(pathname.s);
+ return *ret_size >= 0 ? 0 : -1;
+}
+
+int s_listxattr(struct s pathname, void *list, size_t size, ssize_t *ret_size)
+{
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ *ret_size = listxattr(pathname.s, list, size);
+ free(pathname.s);
+ return *ret_size >= 0 ? 0 : -1;
+}
+
+int s_setxattr(struct s pathname, const char name[], const void *value, size_t size, int flags)
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = setxattr(pathname.s, name, value, size, flags);
+ free(pathname.s);
+ return res;
+}
+
+int s_removexattr(struct s pathname, const char name[])
+{
+ int res;
+
+ if (!pathname.s) {
+ errno = ENOMEM;
+ return -1;
+ }
+
+ res = removexattr(pathname.s, name);
+ free(pathname.s);
+ return res;
+}
+
+int s_rename(struct s oldpathname, struct s newpathname)
+{
+ int res;
+
+ if (oldpathname.s && newpathname.s) {
+ res = rename(oldpathname.s, newpathname.s);
+ } else {
+ res = -1;
+ errno = ENOMEM;
+ }
+ free(oldpathname.s);
+ free(newpathname.s);
+ return res;
+}
+
+int s_fuse_attr(struct s pathname, struct fuse_attr *fuse_attr_out)
+{
+
+ struct stat st;
+ int result = TEST_FAILURE;
+
+ TESTSYSCALL(s_stat(pathname, &st));
+
+ fuse_attr_out->ino = st.st_ino;
+ fuse_attr_out->mode = st.st_mode;
+ fuse_attr_out->nlink = st.st_nlink;
+ fuse_attr_out->uid = st.st_uid;
+ fuse_attr_out->gid = st.st_gid;
+ fuse_attr_out->rdev = st.st_rdev;
+ fuse_attr_out->size = st.st_size;
+ fuse_attr_out->blksize = st.st_blksize;
+ fuse_attr_out->blocks = st.st_blocks;
+ fuse_attr_out->atime = st.st_atime;
+ fuse_attr_out->mtime = st.st_mtime;
+ fuse_attr_out->ctime = st.st_ctime;
+ fuse_attr_out->atimensec = UINT32_MAX;
+ fuse_attr_out->mtimensec = UINT32_MAX;
+ fuse_attr_out->ctimensec = UINT32_MAX;
+
+ result = TEST_SUCCESS;
+out:
+ return result;
+}
+
+struct s tracing_folder(void)
+{
+ struct s trace = {0};
+ FILE *mounts = NULL;
+ char *line = NULL;
+ size_t size = 0;
+
+ TEST(mounts = fopen("/proc/mounts", "re"), mounts);
+ while (getline(&line, &size, mounts) != -1) {
+ if (!s_cmp(s_word(sn(line, line + size), ' ', 2),
+ s("tracefs"))) {
+ trace = s_word(sn(line, line + size), ' ', 1);
+ break;
+ }
+
+ if (!s_cmp(s_word(sn(line, line + size), ' ', 2), s("debugfs")))
+ trace = s_path(s_word(sn(line, line + size), ' ', 1),
+ s("tracing"));
+ }
+out:
+ free(line);
+ if (mounts)
+ fclose(mounts);
+ return trace;
+}
+
+int tracing_on(void)
+{
+ int result = TEST_FAILURE;
+ int tracing_on = -1;
+
+ TEST(tracing_on = s_open(s_path(tracing_folder(), s("tracing_on")),
+ O_WRONLY | O_CLOEXEC),
+ tracing_on != -1);
+ TESTEQUAL(write(tracing_on, "1", 1), 1);
+ result = TEST_SUCCESS;
+out:
+ close(tracing_on);
+ return result;
+}
+
+char *concat_file_name(const char *dir, const char *file)
+{
+ char full_name[FILENAME_MAX] = "";
+
+ if (snprintf(full_name, ARRAY_SIZE(full_name), "%s/%s", dir, file) < 0)
+ return NULL;
+ return strdup(full_name);
+}
+
+char *setup_mount_dir(const char *name)
+{
+ struct stat st;
+ char *current_dir = getcwd(NULL, 0);
+ char *mount_dir = concat_file_name(current_dir, name);
+
+ free(current_dir);
+ if (stat(mount_dir, &st) == 0) {
+ if (S_ISDIR(st.st_mode))
+ return mount_dir;
+
+ ksft_print_msg("%s is a file, not a dir.\n", mount_dir);
+ return NULL;
+ }
+
+ if (mkdir(mount_dir, 0777)) {
+ ksft_print_msg("Can't create mount dir.\n");
+ return NULL;
+ }
+
+ return mount_dir;
+}
+
+int delete_dir_tree(const char *dir_path, bool remove_root)
+{
+ DIR *dir = NULL;
+ struct dirent *dp;
+ int result = 0;
+
+ dir = opendir(dir_path);
+ if (!dir) {
+ result = -errno;
+ goto out;
+ }
+
+ while ((dp = readdir(dir))) {
+ char *full_path;
+
+ if (!strcmp(dp->d_name, ".") || !strcmp(dp->d_name, ".."))
+ continue;
+
+ full_path = concat_file_name(dir_path, dp->d_name);
+ if (dp->d_type == DT_DIR)
+ result = delete_dir_tree(full_path, true);
+ else
+ result = unlink(full_path);
+ free(full_path);
+ if (result)
+ goto out;
+ }
+
+out:
+ if (dir)
+ closedir(dir);
+ if (!result && remove_root)
+ rmdir(dir_path);
+ return result;
+}
+
+static int mount_fuse_maybe_init(const char *mount_dir, const char *bpf_name, int dir_fd,
+ int *fuse_dev_ptr, bool init)
+{
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ char options[FILENAME_MAX];
+ uint8_t bytes_in[FUSE_MIN_READ_BUFFER];
+ uint8_t bytes_out[FUSE_MIN_READ_BUFFER];
+
+ DECL_FUSE_IN(init);
+
+ TEST(fuse_dev = open("/dev/fuse", O_RDWR | O_CLOEXEC), fuse_dev != -1);
+ snprintf(options, FILENAME_MAX, "fd=%d,user_id=0,group_id=0,rootmode=0040000",
+ fuse_dev);
+ if (bpf_name != NULL)
+ snprintf(options + strlen(options),
+ sizeof(options) - strlen(options),
+ ",root_bpf=%s", bpf_name);
+ if (dir_fd != -1)
+ snprintf(options + strlen(options),
+ sizeof(options) - strlen(options),
+ ",root_dir=%d", dir_fd);
+ TESTSYSCALL(mount("ABC", mount_dir, "fuse", 0, options));
+
+ if (init) {
+ TESTFUSEIN(FUSE_INIT, init_in);
+ TESTEQUAL(init_in->major, FUSE_KERNEL_VERSION);
+ TESTEQUAL(init_in->minor, FUSE_KERNEL_MINOR_VERSION);
+ TESTFUSEOUT1(fuse_init_out, ((struct fuse_init_out) {
+ .major = FUSE_KERNEL_VERSION,
+ .minor = FUSE_KERNEL_MINOR_VERSION,
+ .max_readahead = 4096,
+ .flags = 0,
+ .max_background = 0,
+ .congestion_threshold = 0,
+ .max_write = 4096,
+ .time_gran = 1000,
+ .max_pages = 12,
+ .map_alignment = 4096,
+ }));
+ }
+
+ *fuse_dev_ptr = fuse_dev;
+ fuse_dev = -1;
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ return result;
+}
+
+int mount_fuse(const char *mount_dir, const char *bpf_name, int dir_fd, int *fuse_dev_ptr)
+{
+ return mount_fuse_maybe_init(mount_dir, bpf_name, dir_fd, fuse_dev_ptr,
+ true);
+}
+
+int mount_fuse_no_init(const char *mount_dir, const char *bpf_name, int dir_fd,
+ int *fuse_dev_ptr)
+{
+ return mount_fuse_maybe_init(mount_dir, bpf_name, dir_fd, fuse_dev_ptr,
+ false);
+}
+
diff --git a/tools/testing/selftests/filesystems/fuse/fd.txt b/tools/testing/selftests/filesystems/fuse/fd.txt
new file mode 100644
index 000000000000..15ce77180d55
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/fd.txt
@@ -0,0 +1,21 @@
+fuse_daemon $*
+cd fd-dst
+ls
+cd show
+ls
+fsstress -s 123 -d . -p 4 -n 100 -l5
+echo test > wibble
+ls
+cat wibble
+fallocate -l 1000 wobble
+mkdir testdir
+mkdir tmpdir
+rmdir tmpdir
+touch tmp
+mv tmp tmp2
+rm tmp2
+
+# FUSE_LINK
+echo "ln_src contents" > ln_src
+ln ln_src ln_link
+cat ln_link
diff --git a/tools/testing/selftests/filesystems/fuse/fd_bpf.bpf.c b/tools/testing/selftests/filesystems/fuse/fd_bpf.bpf.c
new file mode 100644
index 000000000000..9b6377b96a6e
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/fd_bpf.bpf.c
@@ -0,0 +1,397 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+// Copyright (c) 2021 Google LLC
+
+//#define __EXPORTED_HEADERS__
+//#define __KERNEL__
+
+//#include <uapi/linux/bpf.h>
+//#include <linux/fuse.h>
+
+#include "vmlinux.h"
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+#include "bpf_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+#if 0
+struct fuse_bpf_map {
+ int map_type;
+ int key_size;
+ int value_size;
+ int max_entries;
+};
+SEC("dummy")
+
+inline int strcmp(const char *a, const char *b)
+{
+ int i;
+
+ for (i = 0; i < __builtin_strlen(b) + 1; ++i)
+ if (a[i] != b[i])
+ return -1;
+
+ return 0;
+}
+
+SEC("maps") struct fuse_bpf_map test_map = {
+ BPF_MAP_TYPE_ARRAY,
+ sizeof(uint32_t),
+ sizeof(uint32_t),
+ 1000,
+};
+
+SEC("maps") struct fuse_bpf_map test_map2 = {
+ BPF_MAP_TYPE_HASH,
+ sizeof(uint32_t),
+ sizeof(uint64_t),
+ 76,
+};
+
+SEC("test_daemon")
+
+int trace_daemon(struct __bpf_fuse_args *fa)
+{
+ uint64_t uid_gid = bpf_get_current_uid_gid();
+ uint32_t uid = uid_gid & 0xffffffff;
+ uint64_t pid_tgid = bpf_get_current_pid_tgid();
+ uint32_t pid = pid_tgid & 0xffffffff;
+ uint32_t key = 23;
+ uint32_t *pvalue;
+
+
+ pvalue = bpf_map_lookup_elem(&test_map, &key);
+ if (pvalue) {
+ uint32_t value = *pvalue;
+
+ bpf_printk("pid %u uid %u value %u", pid, uid, value);
+ value++;
+ bpf_map_update_elem(&test_map, &key, &value, BPF_ANY);
+ }
+
+ switch (fa->opcode) {
+#endif
+BPF_STRUCT_OPS(uint32_t, trace_access_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_access_in *in)
+{
+ bpf_printk("Access: %lu", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_getattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getattr_in *in)
+{
+ bpf_printk("Get Attr %lu", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_setattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_setattr_in *in)
+{
+ bpf_printk("Set Attr %lu", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_opendir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ bpf_printk("Open Dir: %lu", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_readdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("Read Dir: fh: %lu, offset %lu", in->fh, in->offset);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_lookup_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("Lookup: %lx %s", meta->nodeid, name_buf);
+ if (meta->nodeid == 1)
+ return BPF_FUSE_USER_PREFILTER;
+ else
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_mknod_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mknod_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("mknod %s %x %x", name_buf, in->rdev | in->mode, in->umask);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_mkdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("mkdir: %s %x %x", name_buf, in->mode, in->umask);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_rmdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("rmdir: %s", name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_rename_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name)
+{
+ struct bpf_dynptr old_name_ptr;
+ //struct bpf_dynptr new_name_ptr;
+ char old_name_buf[255];
+ //char new_name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(old_name, &old_name_ptr);
+ //bpf_fuse_get_ro_dynptr(new_name, &new_name_ptr);
+ bpf_dynptr_read(old_name_buf, 255, &old_name_ptr, 0, 0);
+ //bpf_dynptr_read(new_name_buf, 255, &new_name_ptr, 0, 0);
+ bpf_printk("rename from %s", old_name_buf);
+ //bpf_printk("rename to %s", new_name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_rename2_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename2_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name)
+{
+ struct bpf_dynptr old_name_ptr;
+ //struct bpf_dynptr new_name_ptr;
+ char old_name_buf[255];
+ //char new_name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(old_name, &old_name_ptr);
+ //bpf_fuse_get_ro_dynptr(new_name, &new_name_ptr);
+ bpf_dynptr_read(old_name_buf, 255, &old_name_ptr, 0, 0);
+ //bpf_dynptr_read(new_name_buf, 255, &new_name_ptr, 0, 0);
+ bpf_printk("rename(%x) from %s", in->flags, old_name_buf);
+ //bpf_printk("rename to %s", new_name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_unlink_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("unlink: %s", name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_link_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_link_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char dst_name[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(dst_name, 255, &name_ptr, 0, 0);
+ bpf_printk("Link: %lu %s", in->oldnodeid, dst_name);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_symlink_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name, struct fuse_buffer *path)
+{
+ struct bpf_dynptr name_ptr;
+ //struct bpf_dynptr path_ptr;
+ char link_name[255];
+ //char link_path[4096];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ //bpf_fuse_get_ro_dynptr(path, &path_ptr);
+ bpf_dynptr_read(link_name, 255, &name_ptr, 0, 0);
+ //bpf_dynptr_read(link_path, 4096, &path_ptr, 0, 0);
+
+ bpf_printk("symlink from %s", link_name);
+ //bpf_printk("symlink to %s", link_path);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_get_link_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char link_name[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(link_name, 255, &name_ptr, 0, 0);
+ bpf_printk("readlink from %s", link_name);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_release_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in)
+{
+ bpf_printk("Release: %lu", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_releasedir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in)
+{
+ bpf_printk("Release Dir: %lu", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_create_open_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_create_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("Create %s", name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_open_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ bpf_printk("Open: %lu", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_read_iter_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("Read: fh: %lu, offset %lu, size %lu",
+ in->fh, in->offset, in->size);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_write_iter_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_write_in *in)
+{
+ bpf_printk("Write: fh: %lu, offset %lu, size %lu",
+ in->fh, in->offset, in->size);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_flush_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_flush_in *in)
+{
+ bpf_printk("Flush %lu", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_file_fallocate_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_fallocate_in *in)
+{
+ bpf_printk("Fallocate %lu %lu", in->fh, in->length);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_getxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("Getxattr %lu %s", meta->nodeid, name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_listxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in)
+{
+ bpf_printk("Listxattr %lu %d", meta->nodeid, in->size);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_setxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_setxattr_in *in, struct fuse_buffer *name,
+ struct fuse_buffer *value)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("Setxattr %lu %s", meta->nodeid, name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_statfs_prefilter, const struct bpf_fuse_meta_info *meta)
+{
+ bpf_printk("statfs %lu", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_lseek_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_lseek_in *in)
+{
+ bpf_printk("lseek type:%d, offset:%lld", in->whence, in->offset);
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC(".struct_ops")
+struct fuse_ops trace_ops = {
+ .open_prefilter = (void *)trace_open_prefilter,
+ .opendir_prefilter = (void *)trace_opendir_prefilter,
+ .create_open_prefilter = (void *)trace_create_open_prefilter,
+ .release_prefilter = (void *)trace_release_prefilter,
+ .releasedir_prefilter = (void *)trace_releasedir_prefilter,
+ .flush_prefilter = (void *)trace_flush_prefilter,
+ .lseek_prefilter = (void *)trace_lseek_prefilter,
+ //.copy_file_range_prefilter = (void *)trace_copy_file_range_prefilter,
+ //.fsync_prefilter = (void *)trace_fsync_prefilter,
+ //.dir_fsync_prefilter = (void *)trace_dir_fsync_prefilter,
+ .getxattr_prefilter = (void *)trace_getxattr_prefilter,
+ .listxattr_prefilter = (void *)trace_listxattr_prefilter,
+ .setxattr_prefilter = (void *)trace_setxattr_prefilter,
+ //.removexattr_prefilter = (void *)trace_removexattr_prefilter,
+ .read_iter_prefilter = (void *)trace_read_iter_prefilter,
+ .write_iter_prefilter = (void *)trace_write_iter_prefilter,
+ .file_fallocate_prefilter = (void *)trace_file_fallocate_prefilter,
+ .lookup_prefilter = (void *)trace_lookup_prefilter,
+ .mknod_prefilter = (void *)trace_mknod_prefilter,
+ .mkdir_prefilter = (void *)trace_mkdir_prefilter,
+ .rmdir_prefilter = (void *)trace_rmdir_prefilter,
+ .rename2_prefilter = (void *)trace_rename2_prefilter,
+ .rename_prefilter = (void *)trace_rename_prefilter,
+ .unlink_prefilter = (void *)trace_unlink_prefilter,
+ .link_prefilter = (void *)trace_link_prefilter,
+ .getattr_prefilter = (void *)trace_getattr_prefilter,
+ .setattr_prefilter = (void *)trace_setattr_prefilter,
+ .statfs_prefilter = (void *)trace_statfs_prefilter,
+ .get_link_prefilter = (void *)trace_get_link_prefilter,
+ .symlink_prefilter = (void *)trace_symlink_prefilter,
+ .readdir_prefilter = (void *)trace_readdir_prefilter,
+ .access_prefilter = (void *)trace_access_prefilter,
+ .name = "trace_ops",
+};
+
diff --git a/tools/testing/selftests/filesystems/fuse/fuse_daemon.c b/tools/testing/selftests/filesystems/fuse/fuse_daemon.c
new file mode 100644
index 000000000000..42f9f770988b
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/fuse_daemon.c
@@ -0,0 +1,300 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2021 Google LLC
+ */
+
+#include "test_fuse.h"
+#include "test.skel.h"
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <sys/mount.h>
+#include <sys/stat.h>
+#include <sys/wait.h>
+
+#include <linux/unistd.h>
+
+#include <uapi/linux/fuse.h>
+#include <uapi/linux/bpf.h>
+
+bool user_messages;
+bool kernel_messages;
+
+static int display_trace(void)
+{
+ int pid = -1;
+ int tp = -1;
+ char c;
+ ssize_t bytes_read;
+ static char line[256] = {0};
+
+ if (!kernel_messages)
+ return TEST_SUCCESS;
+
+ TEST(pid = fork(), pid != -1);
+ if (pid != 0)
+ return pid;
+
+ TESTEQUAL(tracing_on(), 0);
+ TEST(tp = s_open(s_path(tracing_folder(), s("trace_pipe")),
+ O_RDONLY | O_CLOEXEC), tp != -1);
+ for (;;) {
+ TEST(bytes_read = read(tp, &c, sizeof(c)),
+ bytes_read == 1);
+ if (c == '\n') {
+ printf("%s\n", line);
+ line[0] = 0;
+ } else if (strlen(line) < sizeof(line) - 1)
+ sprintf(line + strlen(line), "%c", c);
+ }
+out:
+ if (pid == 0) {
+ close(tp);
+ exit(TEST_FAILURE);
+ }
+ return pid;
+}
+
+static const char *fuse_opcode_to_string(int opcode)
+{
+ switch (opcode & FUSE_OPCODE_FILTER) {
+ case FUSE_LOOKUP:
+ return "FUSE_LOOKUP";
+ case FUSE_FORGET:
+ return "FUSE_FORGET";
+ case FUSE_GETATTR:
+ return "FUSE_GETATTR";
+ case FUSE_SETATTR:
+ return "FUSE_SETATTR";
+ case FUSE_READLINK:
+ return "FUSE_READLINK";
+ case FUSE_SYMLINK:
+ return "FUSE_SYMLINK";
+ case FUSE_MKNOD:
+ return "FUSE_MKNOD";
+ case FUSE_MKDIR:
+ return "FUSE_MKDIR";
+ case FUSE_UNLINK:
+ return "FUSE_UNLINK";
+ case FUSE_RMDIR:
+ return "FUSE_RMDIR";
+ case FUSE_RENAME:
+ return "FUSE_RENAME";
+ case FUSE_LINK:
+ return "FUSE_LINK";
+ case FUSE_OPEN:
+ return "FUSE_OPEN";
+ case FUSE_READ:
+ return "FUSE_READ";
+ case FUSE_WRITE:
+ return "FUSE_WRITE";
+ case FUSE_STATFS:
+ return "FUSE_STATFS";
+ case FUSE_RELEASE:
+ return "FUSE_RELEASE";
+ case FUSE_FSYNC:
+ return "FUSE_FSYNC";
+ case FUSE_SETXATTR:
+ return "FUSE_SETXATTR";
+ case FUSE_GETXATTR:
+ return "FUSE_GETXATTR";
+ case FUSE_LISTXATTR:
+ return "FUSE_LISTXATTR";
+ case FUSE_REMOVEXATTR:
+ return "FUSE_REMOVEXATTR";
+ case FUSE_FLUSH:
+ return "FUSE_FLUSH";
+ case FUSE_INIT:
+ return "FUSE_INIT";
+ case FUSE_OPENDIR:
+ return "FUSE_OPENDIR";
+ case FUSE_READDIR:
+ return "FUSE_READDIR";
+ case FUSE_RELEASEDIR:
+ return "FUSE_RELEASEDIR";
+ case FUSE_FSYNCDIR:
+ return "FUSE_FSYNCDIR";
+ case FUSE_GETLK:
+ return "FUSE_GETLK";
+ case FUSE_SETLK:
+ return "FUSE_SETLK";
+ case FUSE_SETLKW:
+ return "FUSE_SETLKW";
+ case FUSE_ACCESS:
+ return "FUSE_ACCESS";
+ case FUSE_CREATE:
+ return "FUSE_CREATE";
+ case FUSE_INTERRUPT:
+ return "FUSE_INTERRUPT";
+ case FUSE_BMAP:
+ return "FUSE_BMAP";
+ case FUSE_DESTROY:
+ return "FUSE_DESTROY";
+ case FUSE_IOCTL:
+ return "FUSE_IOCTL";
+ case FUSE_POLL:
+ return "FUSE_POLL";
+ case FUSE_NOTIFY_REPLY:
+ return "FUSE_NOTIFY_REPLY";
+ case FUSE_BATCH_FORGET:
+ return "FUSE_BATCH_FORGET";
+ case FUSE_FALLOCATE:
+ return "FUSE_FALLOCATE";
+ case FUSE_READDIRPLUS:
+ return "FUSE_READDIRPLUS";
+ case FUSE_RENAME2:
+ return "FUSE_RENAME2";
+ case FUSE_LSEEK:
+ return "FUSE_LSEEK";
+ case FUSE_COPY_FILE_RANGE:
+ return "FUSE_COPY_FILE_RANGE";
+ case FUSE_SETUPMAPPING:
+ return "FUSE_SETUPMAPPING";
+ case FUSE_REMOVEMAPPING:
+ return "FUSE_REMOVEMAPPING";
+ //case FUSE_SYNCFS:
+ // return "FUSE_SYNCFS";
+ case CUSE_INIT:
+ return "CUSE_INIT";
+ case CUSE_INIT_BSWAP_RESERVED:
+ return "CUSE_INIT_BSWAP_RESERVED";
+ case FUSE_INIT_BSWAP_RESERVED:
+ return "FUSE_INIT_BSWAP_RESERVED";
+ }
+ return "?";
+}
+
+static int parse_options(int argc, char *const *argv)
+{
+ int c;
+
+ while ((c = getopt(argc, argv, "kuv")) != -1)
+ switch (c) {
+ case 'v':
+ test_options.verbose = true;
+ break;
+
+ case 'u':
+ user_messages = true;
+ break;
+
+ case 'k':
+ kernel_messages = true;
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+int main(int argc, char *argv[])
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ int result = TEST_FAILURE;
+ int trace_pid = -1;
+ char *mount_dir = NULL;
+ char *src_dir = NULL;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ //struct map_relocation *map_relocations = NULL;
+ //size_t map_count = 0;
+ //int i;
+
+ if (geteuid() != 0)
+ ksft_print_msg("Not running as root, mount might fail.\n");
+ TESTEQUAL(parse_options(argc, argv), 0);
+
+ TEST(trace_pid = display_trace(), trace_pid != -1);
+
+ delete_dir_tree("fd-src", true);
+ TEST(src_dir = setup_mount_dir("fd-src"), src_dir);
+ delete_dir_tree("fd-dst", true);
+ TEST(mount_dir = setup_mount_dir("fd-dst"), mount_dir);
+
+ TEST(test_skel = test_bpf__open_and_load(), test_skel);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.trace_ops),
+ test_link);
+ TEST(src_fd = open("fd-src", O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TESTSYSCALL(mkdirat(src_fd, "show", 0777));
+ TESTSYSCALL(mkdirat(src_fd, "hide", 0777));
+
+ /*for (i = 0; i < map_count; ++i)
+ if (!strcmp(map_relocations[i].name, "test_map")) {
+ uint32_t key = 23;
+ uint32_t value = 1234;
+ union bpf_attr attr = {
+ .map_fd = map_relocations[i].fd,
+ .key = ptr_to_u64(&key),
+ .value = ptr_to_u64(&value),
+ .flags = BPF_ANY,
+ };
+ TESTSYSCALL(syscall(__NR_bpf, BPF_MAP_UPDATE_ELEM,
+ &attr, sizeof(attr)));
+ }
+*/
+ TESTEQUAL(mount_fuse(mount_dir, "trace_ops", src_fd, &fuse_dev), 0);
+
+ if (fork())
+ return 0;
+
+ for (;;) {
+ uint8_t bytes_in[FUSE_MIN_READ_BUFFER];
+ uint8_t bytes_out[FUSE_MIN_READ_BUFFER] __attribute__((unused));
+ struct fuse_in_header *in_header =
+ (struct fuse_in_header *)bytes_in;
+ ssize_t res = read(fuse_dev, bytes_in, sizeof(bytes_in));
+
+ if (res == -1)
+ break;
+
+ switch (in_header->opcode) {
+ case FUSE_LOOKUP | FUSE_PREFILTER: {
+ char *name = (char *)(bytes_in + sizeof(*in_header));
+
+ if (user_messages)
+ printf("Lookup %s\n", name);
+ if (!strcmp(name, "hide"))
+ TESTFUSEOUTERROR(-ENOENT);
+ else {
+ printf("Lookup Prefilter response: %s\n", name);
+ TESTFUSEOUTREAD(name, strlen(name) + 1);
+ }
+ break;
+ }
+ default:
+ if (user_messages) {
+ printf("opcode is %d (%s)\n", in_header->opcode,
+ fuse_opcode_to_string(
+ in_header->opcode));
+ }
+ break;
+ }
+ }
+
+ result = TEST_SUCCESS;
+
+out:
+ /*for (i = 0; i < map_count; ++i) {
+ free(map_relocations[i].name);
+ close(map_relocations[i].fd);
+ }
+ free(map_relocations);*/
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ umount2(mount_dir, MNT_FORCE);
+ delete_dir_tree(mount_dir, true);
+ free(mount_dir);
+ delete_dir_tree(src_dir, true);
+ free(src_dir);
+ if (trace_pid > 0)
+ kill(trace_pid, SIGKILL);
+ return result;
+}
diff --git a/tools/testing/selftests/filesystems/fuse/fuse_test.c b/tools/testing/selftests/filesystems/fuse/fuse_test.c
new file mode 100644
index 000000000000..cc14b79615c1
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/fuse_test.c
@@ -0,0 +1,2412 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2021 Google LLC
+ */
+#define _GNU_SOURCE
+
+#include "test_fuse.h"
+#include "test.skel.h"
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include <sys/inotify.h>
+#include <sys/mman.h>
+#include <sys/mount.h>
+#include <sys/syscall.h>
+#include <sys/wait.h>
+
+#include <linux/capability.h>
+#include <linux/random.h>
+
+#include <uapi/linux/fuse.h>
+#include <uapi/linux/bpf.h>
+
+static const char *ft_src = "ft-src";
+static const char *ft_dst = "ft-dst";
+
+static void fill_buffer(uint8_t *data, size_t len, int file, int block)
+{
+ int i;
+ unsigned int seed = 7919 * file + block;
+
+ for (i = 0; i < len; i++) {
+ seed = 1103515245 * seed + 12345;
+ data[i] = (uint8_t)(seed >> (i % 13));
+ }
+}
+
+static bool test_buffer(uint8_t *data, size_t len, int file, int block)
+{
+ int i;
+ unsigned int seed = 7919 * file + block;
+
+ for (i = 0; i < len; i++) {
+ seed = 1103515245 * seed + 12345;
+ if (data[i] != (uint8_t)(seed >> (i % 13)))
+ return false;
+ }
+
+ return true;
+}
+
+static int create_file(int dir, struct s name, int index, size_t blocks)
+{
+ int result = TEST_FAILURE;
+ int fd = -1;
+ int i;
+ uint8_t data[PAGE_SIZE];
+
+ TEST(fd = s_openat(dir, name, O_CREAT | O_WRONLY, 0777), fd != -1);
+ for (i = 0; i < blocks; ++i) {
+ fill_buffer(data, PAGE_SIZE, index, i);
+ TESTEQUAL(write(fd, data, sizeof(data)), PAGE_SIZE);
+ }
+ TESTSYSCALL(close(fd));
+ result = TEST_SUCCESS;
+
+out:
+ close(fd);
+ return result;
+}
+
+static int bpf_clear_trace(void)
+{
+ int result = TEST_FAILURE;
+ int tp = -1;
+
+ TEST(tp = s_open(s_path(tracing_folder(), s("trace")),
+ O_WRONLY | O_TRUNC | O_CLOEXEC), tp != -1);
+
+ result = TEST_SUCCESS;
+out:
+ close(tp);
+ return result;
+}
+
+static int bpf_test_trace_maybe(const char *substr, bool present)
+{
+ int result = TEST_FAILURE;
+ int tp = -1;
+ char trace_buffer[4096] = {};
+ ssize_t bytes_read;
+
+ TEST(tp = s_open(s_path(tracing_folder(), s("trace_pipe")),
+ O_RDONLY | O_CLOEXEC),
+ tp != -1);
+ fcntl(tp, F_SETFL, O_NONBLOCK);
+
+ for (;;) {
+ bytes_read = read(tp, trace_buffer, sizeof(trace_buffer) - 1);
+ if (present)
+ TESTCOND(bytes_read > 0);
+ else if (bytes_read <= 0) {
+ result = TEST_SUCCESS;
+ break;
+ }
+
+ if (test_options.verbose)
+ ksft_print_msg("%s\n", trace_buffer);
+
+ if (strstr(trace_buffer, substr)) {
+ if (present)
+ result = TEST_SUCCESS;
+ break;
+ }
+ }
+out:
+ close(tp);
+ return result;
+}
+
+static int bpf_test_trace(const char *substr)
+{
+ return bpf_test_trace_maybe(substr, true);
+}
+
+static int bpf_test_no_trace(const char *substr)
+{
+ return bpf_test_trace_maybe(substr, false);
+}
+
+static int basic_test(const char *mount_dir)
+{
+ const char *test_name = "test";
+ const char *test_data = "data";
+
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ char *filename = NULL;
+ int fd = -1;
+ int pid = -1;
+ int status;
+
+ TESTEQUAL(mount_fuse(mount_dir, NULL, -1, &fuse_dev), 0);
+ FUSE_ACTION
+ char data[256] = {};
+
+ filename = concat_file_name(mount_dir, test_name);
+ TESTERR(fd = open(filename, O_RDONLY | O_CLOEXEC), fd != -1);
+ TESTEQUAL(read(fd, data, strlen(test_data)), strlen(test_data));
+ TESTCOND(!strcmp(data, test_data));
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ FUSE_DAEMON
+ DECL_FUSE_IN(open);
+ DECL_FUSE_IN(read);
+ DECL_FUSE_IN(flush);
+ DECL_FUSE_IN(release);
+
+ TESTFUSELOOKUP(test_name, 0);
+ TESTFUSEOUT1(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = 2,
+ .generation = 1,
+ .attr.ino = 100,
+ .attr.size = 4,
+ .attr.blksize = 512,
+ .attr.mode = S_IFREG | 0777,
+ }));
+
+ TESTFUSEIN(FUSE_OPEN, open_in);
+ TESTFUSEOUT1(fuse_open_out, ((struct fuse_open_out) {
+ .fh = 1,
+ .open_flags = open_in->flags,
+ }));
+
+ TESTFUSEIN(FUSE_READ, read_in);
+ TESTFUSEOUTREAD(test_data, strlen(test_data));
+
+ TESTFUSEIN(FUSE_FLUSH, flush_in);
+ TESTFUSEOUTEMPTY();
+
+ TESTFUSEIN(FUSE_RELEASE, release_in);
+ TESTFUSEOUTEMPTY();
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ if (!pid)
+ exit(TEST_FAILURE);
+ close(fuse_dev);
+ close(fd);
+ free(filename);
+ umount(mount_dir);
+ return result;
+}
+
+static int bpf_test_real(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *test_name = "real";
+ const char *test_data = "Weebles wobble but they don't fall down";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ char *filename = NULL;
+ int fd = -1;
+ char read_buffer[256] = {};
+ ssize_t bytes_read;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(fd = openat(src_fd, test_name, O_CREAT | O_RDWR | O_CLOEXEC, 0777),
+ fd != -1);
+ TESTEQUAL(write(fd, test_data, strlen(test_data)), strlen(test_data));
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ filename = concat_file_name(mount_dir, test_name);
+ TESTERR(fd = open(filename, O_RDONLY | O_CLOEXEC), fd != -1);
+ bytes_read = read(fd, read_buffer, strlen(test_data));
+
+ TESTEQUAL(bytes_read, strlen(test_data));
+ TESTEQUAL(strcmp(test_data, read_buffer), 0);
+ TESTEQUAL(bpf_test_trace("read"), 0);
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ close(fd);
+ free(filename);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+
+static int bpf_test_partial(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *test_name = "partial";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ char *filename = NULL;
+ int fd = -1;
+ int pid = -1;
+ int status;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TESTEQUAL(create_file(src_fd, s(test_name), 1, 2), 0);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ FUSE_ACTION
+ uint8_t data[PAGE_SIZE];
+
+ TEST(filename = concat_file_name(mount_dir, test_name),
+ filename);
+ TESTERR(fd = open(filename, O_RDONLY | O_CLOEXEC), fd != -1);
+ TESTEQUAL(read(fd, data, PAGE_SIZE), PAGE_SIZE);
+ //TESTEQUAL(bpf_test_trace("read"), 0);
+ TESTCOND(test_buffer(data, PAGE_SIZE, 2, 0));
+ TESTCOND(!test_buffer(data, PAGE_SIZE, 1, 0));
+ TESTEQUAL(read(fd, data, PAGE_SIZE), PAGE_SIZE);
+ TESTCOND(test_buffer(data, PAGE_SIZE, 1, 1));
+ TESTCOND(!test_buffer(data, PAGE_SIZE, 2, 1));
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ FUSE_DAEMON
+ uint32_t *err_in;
+ DECL_FUSE(open);
+ DECL_FUSE(read);
+ DECL_FUSE(release);
+ uint8_t data[PAGE_SIZE];
+
+ TESTFUSEIN2_ERR_IN(FUSE_OPEN | FUSE_POSTFILTER, open_in, open_out, err_in);
+ TESTEQUAL(*err_in, 0);
+ TESTFUSEOUT1(fuse_open_out, ((struct fuse_open_out) {
+ .fh = 1,
+ .open_flags = open_in->flags,
+ }));
+
+ TESTFUSEIN(FUSE_READ, read_in);
+ fill_buffer(data, PAGE_SIZE, 2, 0);
+ TESTFUSEOUTREAD(data, PAGE_SIZE);
+
+ //TESTFUSEIN(FUSE_RELEASE, release_in);
+ //TESTFUSEOUTEMPTY();
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ if (!pid)
+ exit(TEST_FAILURE);
+ close(fuse_dev);
+ close(fd);
+ free(filename);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_attrs(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *test_name = "partial";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ char *filename = NULL;
+ struct stat st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TESTEQUAL(create_file(src_fd, s(test_name), 1, 2), 0);
+
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TEST(filename = concat_file_name(mount_dir, test_name), filename);
+ TESTSYSCALL(stat(filename, &st));
+ TESTSYSCALL(chmod(filename, 0111));
+ TESTSYSCALL(stat(filename, &st));
+ TESTEQUAL(st.st_mode & 0777, 0111);
+ TESTSYSCALL(chmod(filename, 0777));
+ TESTSYSCALL(stat(filename, &st));
+ TESTEQUAL(st.st_mode & 0777, 0777);
+ TESTSYSCALL(chown(filename, 5, 6));
+ TESTSYSCALL(stat(filename, &st));
+ TESTEQUAL(st.st_uid, 5);
+ TESTEQUAL(st.st_gid, 6);
+
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ free(filename);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_readdir(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *names[] = {"real", "partial", "fake", ".", ".."};
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int pid = -1;
+ int status;
+ DIR *dir = NULL;
+ struct dirent *dirent;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TESTEQUAL(create_file(src_fd, s(names[0]), 1, 2), 0);
+ TESTEQUAL(create_file(src_fd, s(names[1]), 1, 2), 0);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ FUSE_ACTION
+ int i, j;
+
+ TEST(dir = s_opendir(s(mount_dir)), dir);
+ TESTEQUAL(bpf_test_trace("opendir"), 0);
+
+ for (i = 0; i < ARRAY_SIZE(names); ++i) {
+ TEST(dirent = readdir(dir), dirent);
+
+ for (j = 0; j < ARRAY_SIZE(names); ++j)
+ if (names[j] &&
+ strcmp(names[j], dirent->d_name) == 0) {
+ names[j] = NULL;
+ break;
+ }
+ TESTNE(j, ARRAY_SIZE(names));
+ }
+ TEST(dirent = readdir(dir), dirent == NULL);
+ TESTSYSCALL(closedir(dir));
+ dir = NULL;
+ TESTEQUAL(bpf_test_trace("readdir"), 0);
+ FUSE_DAEMON
+ struct fuse_in_header *in_header =
+ (struct fuse_in_header *)bytes_in;
+ ssize_t res = read(fuse_dev, bytes_in, sizeof(bytes_in));
+ // ignore the error_in extension
+ res -= ERR_IN_EXT_LEN;
+ struct fuse_read_out *read_out =
+ (struct fuse_read_out *) (bytes_in +
+ sizeof(*in_header) +
+ sizeof(struct fuse_read_in));
+ struct fuse_dirent *fuse_dirent =
+ (struct fuse_dirent *) (bytes_in + res);
+
+ TESTGE(res, sizeof(*in_header) + sizeof(struct fuse_read_in));
+ TESTEQUAL(in_header->opcode, FUSE_READDIR | FUSE_POSTFILTER);
+ *fuse_dirent = (struct fuse_dirent) {
+ .ino = 100,
+ .off = 5,
+ .namelen = strlen("fake"),
+ .type = DT_REG,
+ };
+ strcpy((char *)(bytes_in + res + sizeof(*fuse_dirent)), "fake");
+ res += FUSE_DIRENT_ALIGN(sizeof(*fuse_dirent) + strlen("fake") +
+ 1);
+ TESTFUSEDIROUTREAD(read_out,
+ bytes_in +
+ sizeof(struct fuse_in_header) +
+ sizeof(struct fuse_read_in) +
+ sizeof(struct fuse_read_out),
+ res - sizeof(struct fuse_in_header) -
+ sizeof(struct fuse_read_in) -
+ sizeof(struct fuse_read_out));
+ res = read(fuse_dev, bytes_in, sizeof(bytes_in));
+ TESTEQUAL(res, sizeof(*in_header) +
+ sizeof(struct fuse_read_in) +
+ sizeof(struct fuse_read_out) + ERR_IN_EXT_LEN);
+ TESTEQUAL(in_header->opcode, FUSE_READDIR | FUSE_POSTFILTER);
+ TESTFUSEDIROUTREAD(read_out, bytes_in, 0);
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ closedir(dir);
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_redact_readdir(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *names[] = {"f1", "f2", "f3", "f4", "f5", "f6", ".", ".."};
+ int num_shown = (ARRAY_SIZE(names) - 2) / 2 + 2;
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int pid = -1;
+ int status;
+ DIR *dir = NULL;
+ struct dirent *dirent;
+ int i;
+ int count = 0;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ for (i = 0; i < ARRAY_SIZE(names) - 2; i++)
+ TESTEQUAL(create_file(src_fd, s(names[i]), 1, 2), 0);
+
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.readdir_redact_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "readdir_redact", src_fd, &fuse_dev), 0);
+
+ FUSE_ACTION
+ int j;
+
+ TEST(dir = s_opendir(s(mount_dir)), dir);
+ while ((dirent = readdir(dir))) {
+ errno = 0;
+ TESTEQUAL(errno, 0);
+ for (j = 0; j < ARRAY_SIZE(names); ++j)
+ if (names[j] &&
+ strcmp(names[j], dirent->d_name) == 0) {
+ names[j] = NULL;
+ count++;
+ break;
+ }
+ TESTNE(j, ARRAY_SIZE(names));
+ TESTGE(num_shown, count);
+ }
+ TESTEQUAL(count, num_shown);
+ TESTSYSCALL(closedir(dir));
+ dir = NULL;
+ FUSE_DAEMON
+ bool skip = true;
+ for (int i = 0; i < ARRAY_SIZE(names) + 1; i++) {
+ uint8_t bytes_in[FUSE_MIN_READ_BUFFER];
+ uint8_t bytes_out[FUSE_MIN_READ_BUFFER];
+ struct fuse_in_header *in_header =
+ (struct fuse_in_header *)bytes_in;
+ ssize_t res = read(fuse_dev, bytes_in, sizeof(bytes_in));
+ int length_out = 0;
+ uint8_t *pos;
+ uint8_t *dirs_in;
+ uint8_t *dirs_out;
+ struct fuse_read_in *fuse_read_in;
+ struct fuse_read_out *fuse_read_out_in;
+ struct fuse_read_out *fuse_read_out_out;
+ struct fuse_dirent *fuse_dirent_in = NULL;
+ struct fuse_dirent *next = NULL;
+ bool again = false;
+ int dir_ent_len = 0;
+
+ // We're ignoring the error_in extension
+ res -= ERR_IN_EXT_LEN;
+ TESTGE(res, sizeof(struct fuse_in_header) +
+ sizeof(struct fuse_read_in) +
+ sizeof(struct fuse_read_out));
+
+ pos = bytes_in + sizeof(struct fuse_in_header);
+ fuse_read_in = (struct fuse_read_in *) pos;
+ pos += sizeof(*fuse_read_in);
+ fuse_read_out_in = (struct fuse_read_out *) pos;
+ pos += sizeof(*fuse_read_out_in);
+ dirs_in = pos;
+
+ pos = bytes_out + sizeof(struct fuse_out_header);
+ fuse_read_out_out = (struct fuse_read_out *) pos;
+ pos += sizeof(*fuse_read_out_out);
+ dirs_out = pos;
+
+ if (dirs_in < bytes_in + res) {
+ bool is_dot;
+
+ fuse_dirent_in = (struct fuse_dirent *) dirs_in;
+ is_dot = (fuse_dirent_in->namelen == 1 &&
+ !strncmp(fuse_dirent_in->name, ".", 1)) ||
+ (fuse_dirent_in->namelen == 2 &&
+ !strncmp(fuse_dirent_in->name, "..", 2));
+
+ dir_ent_len = FUSE_DIRENT_ALIGN(
+ sizeof(*fuse_dirent_in) +
+ fuse_dirent_in->namelen);
+
+ if (dirs_in + dir_ent_len < bytes_in + res)
+ next = (struct fuse_dirent *)
+ (dirs_in + dir_ent_len);
+
+ if (!skip || is_dot) {
+ memcpy(dirs_out, fuse_dirent_in,
+ sizeof(struct fuse_dirent) +
+ fuse_dirent_in->namelen);
+ length_out += dir_ent_len;
+ }
+ again = ((skip && !is_dot) && next);
+
+ if (!is_dot)
+ skip = !skip;
+ }
+
+ fuse_read_out_out->offset = next ? next->off :
+ fuse_read_out_in->offset;
+ fuse_read_out_out->again = again;
+
+ {
+ struct fuse_out_header *out_header =
+ (struct fuse_out_header *)bytes_out;
+
+ *out_header = (struct fuse_out_header) {
+ .len = sizeof(*out_header) +
+ sizeof(*fuse_read_out_out) + length_out,
+ .unique = in_header->unique,
+ };
+ TESTEQUAL(write(fuse_dev, bytes_out, out_header->len),
+ out_header->len);
+ }
+ }
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ closedir(dir);
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+/*
+ * This test is more to show what classic fuse does with a creat in a subdir
+ * than a test of any new functionality
+ */
+static int bpf_test_creat(const char *mount_dir)
+{
+ const char *dir_name = "show";
+ const char *file_name = "file";
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int pid = -1;
+ int status;
+ int fd = -1;
+
+ TESTEQUAL(mount_fuse(mount_dir, NULL, -1, &fuse_dev), 0);
+
+ FUSE_ACTION
+ TEST(fd = s_creat(s_path(s_path(s(mount_dir), s(dir_name)),
+ s(file_name)),
+ 0777),
+ fd != -1);
+ TESTSYSCALL(close(fd));
+ FUSE_DAEMON
+ DECL_FUSE_IN(create);
+ DECL_FUSE_IN(release);
+ DECL_FUSE_IN(flush);
+
+ TESTFUSELOOKUP(dir_name, 0);
+ TESTFUSEOUT1(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = 3,
+ .generation = 1,
+ .attr.ino = 100,
+ .attr.size = 4,
+ .attr.blksize = 512,
+ .attr.mode = S_IFDIR | 0777,
+ }));
+
+ TESTFUSELOOKUP(file_name, 0);
+ TESTFUSEOUTERROR(-ENOENT);
+
+ TESTFUSEINEXT(FUSE_CREATE, create_in, strlen(file_name) + 1);
+ TESTFUSEOUT2(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = 2,
+ .generation = 1,
+ .attr.ino = 200,
+ .attr.size = 4,
+ .attr.blksize = 512,
+ .attr.mode = S_IFREG,
+ }),
+ fuse_open_out, ((struct fuse_open_out) {
+ .fh = 1,
+ .open_flags = create_in->flags,
+ }));
+
+ TESTFUSEIN(FUSE_FLUSH, flush_in);
+ TESTFUSEOUTEMPTY();
+
+ TESTFUSEIN(FUSE_RELEASE, release_in);
+ TESTFUSEOUTEMPTY();
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ return result;
+}
+
+static int bpf_test_hidden_entries(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ static const char * const dir_names[] = {
+ "show",
+ "hide",
+ };
+ const char *file_name = "file";
+ const char *data = "The quick brown fox jumps over the lazy dog\n";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int fd = -1;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TESTSYSCALL(mkdirat(src_fd, dir_names[0], 0777));
+ TESTSYSCALL(mkdirat(src_fd, dir_names[1], 0777));
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_hidden_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_hidden", src_fd, &fuse_dev), 0);
+
+ TEST(fd = s_creat(s_path(s_path(s(mount_dir), s(dir_names[0])),
+ s(file_name)),
+ 0777),
+ fd != -1);
+ TESTSYSCALL(fallocate(fd, 0, 0, 4096));
+ TESTEQUAL(write(fd, data, strlen(data)), strlen(data));
+ TESTSYSCALL(close(fd));
+ TESTEQUAL(bpf_test_trace("Create"), 0);
+
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_dir(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *dir_name = "dir";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ struct stat st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_mkdir(s_path(s(mount_dir), s(dir_name)), 0777));
+ TESTEQUAL(bpf_test_trace("mkdir"), 0);
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(dir_name)), &st));
+ TESTSYSCALL(s_rmdir(s_path(s(mount_dir), s(dir_name))));
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(dir_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_file(const char *mount_dir, bool close_first)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *file_name = "real";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int fd = -1;
+ struct stat st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TEST(fd = s_creat(s_path(s(mount_dir), s(file_name)),
+ 0777),
+ fd != -1);
+ TESTEQUAL(bpf_test_trace("Create"), 0);
+ if (close_first) {
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ }
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(file_name)), &st));
+ TESTSYSCALL(s_unlink(s_path(s(mount_dir), s(file_name))));
+ TESTEQUAL(bpf_test_trace("unlink"), 0);
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(file_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+ if (!close_first) {
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ }
+ result = TEST_SUCCESS;
+out:
+ close(fd);
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_file_early_close(const char *mount_dir)
+{
+ return bpf_test_file(mount_dir, true);
+}
+
+static int bpf_test_file_late_close(const char *mount_dir)
+{
+ return bpf_test_file(mount_dir, false);
+}
+
+static int bpf_test_alter_errcode_bpf(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *dir_name = "dir";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ struct stat st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_error_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_error", src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_mkdir(s_path(s(mount_dir), s(dir_name)), 0777));
+ //TESTEQUAL(bpf_test_trace("mkdir"), 0);
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(dir_name)), &st));
+ TESTEQUAL(s_mkdir(s_path(s(mount_dir), s(dir_name)), 0777), -EPERM);
+ TESTSYSCALL(s_rmdir(s_path(s(mount_dir), s(dir_name))));
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(dir_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_alter_errcode_userspace(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *dir_name = "doesnotexist";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int pid = -1;
+ int status;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_error_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_error", src_fd, &fuse_dev), 0);
+
+ FUSE_ACTION
+ TESTEQUAL(s_unlink(s_path(s(mount_dir), s(dir_name))),
+ -1);
+ TESTEQUAL(errno, ENOMEM);
+ FUSE_DAEMON
+ TESTFUSELOOKUP("doesnotexist", FUSE_POSTFILTER);
+ TESTFUSEOUTERROR(-ENOMEM);
+ FUSE_DONE
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+// TODO: Make equivalent struct_ops tests
+#if 0
+static int bpf_test_verifier(const char *mount_dir)
+{
+ int result = TEST_FAILURE;
+ int bpf_fd1 = -1;
+ int bpf_fd2 = -1;
+ int bpf_fd3 = -1;
+
+ TESTEQUAL(install_elf_bpf("test_bpf.bpf.o", "test_verify",
+ &bpf_fd1, NULL, NULL), 0);
+ TESTEQUAL(install_elf_bpf_invalid("test_bpf.bpf.o", "test_verify_fail",
+ &bpf_fd2, NULL, NULL), 0);
+ TESTEQUAL(install_elf_bpf_invalid("test_bpf.bpf.o", "test_verify_fail2",
+ &bpf_fd3, NULL, NULL), 0);
+ result = TEST_SUCCESS;
+out:
+ close(bpf_fd1);
+ close(bpf_fd2);
+ close(bpf_fd3);
+ return result;
+}
+
+static int bpf_test_verifier_out_args(const char *mount_dir)
+{
+ int result = TEST_FAILURE;
+ int bpf_fd1 = -1;
+ int bpf_fd2 = -1;
+
+ TESTEQUAL(install_elf_bpf_invalid("test_bpf.bpf.o", "test_verify_fail3",
+ &bpf_fd1, NULL, NULL), 0);
+ TESTEQUAL(install_elf_bpf_invalid("test_bpf.bpf.o", "test_verify_fail4",
+ &bpf_fd2, NULL, NULL), 0);
+ result = TEST_SUCCESS;
+out:
+ close(bpf_fd1);
+ close(bpf_fd2);
+ return result;
+}
+
+static int bpf_test_verifier_packet_invalidation(const char *mount_dir)
+{
+ int result = TEST_FAILURE;
+ int bpf_fd1 = -1;
+ int bpf_fd2 = -1;
+
+ TESTEQUAL(install_elf_bpf_invalid("test_bpf.bpf.o", "test_verify_fail5",
+ &bpf_fd1, NULL, NULL), 0);
+ TESTEQUAL(install_elf_bpf("test_bpf.bpf.o", "test_verify5",
+ &bpf_fd2, NULL, NULL), 0);
+ result = TEST_SUCCESS;
+out:
+ close(bpf_fd1);
+ close(bpf_fd2);
+ return result;
+}
+
+static int bpf_test_verifier_nonsense_read(const char *mount_dir)
+{
+ int result = TEST_FAILURE;
+ int bpf_fd1 = -1;
+
+ TESTEQUAL(install_elf_bpf_invalid("test_bpf.bpf.o", "test_verify_fail6",
+ &bpf_fd1, NULL, NULL), 0);
+ result = TEST_SUCCESS;
+out:
+ close(bpf_fd1);
+ return result;
+}
+#endif
+
+static int bpf_test_mknod(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *file_name = "real";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int bpf_fd = -1;
+ int fuse_dev = -1;
+ struct stat st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_mkfifo(s_path(s(mount_dir), s(file_name)), 0777));
+ TESTEQUAL(bpf_test_trace("mknod"), 0);
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(file_name)), &st));
+ TESTSYSCALL(s_unlink(s_path(s(mount_dir), s(file_name))));
+ TESTEQUAL(bpf_test_trace("unlink"), 0);
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(file_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(bpf_fd);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_largedir(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *show = "show";
+ const int files = 1000;
+
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int bpf_fd = -1;
+ int fuse_dev = -1;
+ int pid = -1;
+ int status;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "trace_ops", src_fd, &fuse_dev), 0);
+
+ FUSE_ACTION
+ int i;
+ int fd;
+ DIR *dir = NULL;
+ struct dirent *dirent;
+
+ TESTSYSCALL(s_mkdir(s_path(s(mount_dir), s(show)), 0777));
+ for (i = 0; i < files; ++i) {
+ char filename[NAME_MAX];
+
+ sprintf(filename, "%d", i);
+ TEST(fd = s_creat(s_path(s_path(s(mount_dir), s(show)),
+ s(filename)), 0777), fd != -1);
+ TESTSYSCALL(close(fd));
+ }
+
+ TEST(dir = s_opendir(s_path(s(mount_dir), s(show))), dir);
+ for (dirent = readdir(dir); dirent; dirent = readdir(dir))
+ ;
+ closedir(dir);
+ FUSE_DAEMON
+ int i;
+
+ for (i = 0; i < files + 2; ++i) {
+ TESTFUSELOOKUP(show, FUSE_PREFILTER);
+ TESTFUSEOUTREAD(show, 5);
+ }
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(bpf_fd);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_link(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *file_name = "real";
+ const char *link_name = "partial";
+ int result = TEST_FAILURE;
+ int fd = -1;
+ int src_fd = -1;
+ int bpf_fd = -1;
+ int fuse_dev = -1;
+ struct stat st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TEST(fd = s_creat(s_path(s(mount_dir), s(file_name)), 0777), fd != -1);
+ TESTEQUAL(bpf_test_trace("Create"), 0);
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(file_name)), &st));
+
+ TESTSYSCALL(s_link(s_path(s(mount_dir), s(file_name)),
+ s_path(s(mount_dir), s(link_name))));
+
+ TESTEQUAL(bpf_test_trace("link"), 0);
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(link_name)), &st));
+
+ TESTSYSCALL(s_unlink(s_path(s(mount_dir), s(link_name))));
+ TESTEQUAL(bpf_test_trace("unlink"), 0);
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(link_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+
+ TESTSYSCALL(s_unlink(s_path(s(mount_dir), s(file_name))));
+ TESTEQUAL(bpf_test_trace("unlink"), 0);
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(file_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+
+ result = TEST_SUCCESS;
+out:
+ close(fd);
+ close(fuse_dev);
+ umount(mount_dir);
+ close(bpf_fd);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_symlink(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *test_name = "real";
+ const char *symlink_name = "partial";
+ const char *test_data = "Weebles wobble but they don't fall down";
+ int result = TEST_FAILURE;
+ int bpf_fd = -1;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int fd = -1;
+ char read_buffer[256] = {};
+ ssize_t bytes_read;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(fd = openat(src_fd, test_name, O_CREAT | O_RDWR | O_CLOEXEC, 0777),
+ fd != -1);
+ TESTEQUAL(write(fd, test_data, strlen(test_data)), strlen(test_data));
+ TESTSYSCALL(close(fd));
+ fd = -1;
+
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_symlink(s_path(s(mount_dir), s(test_name)),
+ s_path(s(mount_dir), s(symlink_name))));
+ TESTEQUAL(bpf_test_trace("symlink"), 0);
+
+ TESTERR(fd = s_open(s_path(s(mount_dir), s(symlink_name)), O_RDONLY | O_CLOEXEC), fd != -1);
+ bytes_read = read(fd, read_buffer, strlen(test_data));
+ TESTEQUAL(bpf_test_trace("readlink"), 0);
+ TESTEQUAL(bytes_read, strlen(test_data));
+ TESTEQUAL(strcmp(test_data, read_buffer), 0);
+
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ close(fd);
+ umount(mount_dir);
+ close(src_fd);
+ close(bpf_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_xattr(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ static const char file_name[] = "real";
+ static const char xattr_name[] = "user.xattr_test_name";
+ static const char xattr_value[] = "this_is_a_test";
+ const size_t xattr_size = sizeof(xattr_value);
+ char xattr_value_ret[256];
+ ssize_t xattr_size_ret;
+ int result = TEST_FAILURE;
+ int fd = -1;
+ int src_fd = -1;
+ int bpf_fd = -1;
+ int fuse_dev = -1;
+ struct stat st;
+
+ memset(xattr_value_ret, '\0', sizeof(xattr_value_ret));
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+
+ TEST(fd = s_creat(s_path(s(mount_dir), s(file_name)), 0777), fd != -1);
+ TESTEQUAL(bpf_test_trace("Create"), 0);
+ TESTSYSCALL(close(fd));
+
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(file_name)), &st));
+ TEST(result = s_getxattr(s_path(s(mount_dir), s(file_name)), xattr_name,
+ xattr_value_ret, sizeof(xattr_value_ret),
+ &xattr_size_ret),
+ result == -1);
+ TESTEQUAL(errno, ENODATA);
+ TESTEQUAL(bpf_test_trace("getxattr"), 0);
+
+ TESTSYSCALL(s_listxattr(s_path(s(mount_dir), s(file_name)),
+ xattr_value_ret, sizeof(xattr_value_ret),
+ &xattr_size_ret));
+ TESTEQUAL(bpf_test_trace("listxattr"), 0);
+ TESTEQUAL(xattr_size_ret, 0);
+
+ TESTSYSCALL(s_setxattr(s_path(s(mount_dir), s(file_name)), xattr_name,
+ xattr_value, xattr_size, 0));
+ TESTEQUAL(bpf_test_trace("setxattr"), 0);
+
+ TESTSYSCALL(s_listxattr(s_path(s(mount_dir), s(file_name)),
+ xattr_value_ret, sizeof(xattr_value_ret),
+ &xattr_size_ret));
+ TESTEQUAL(bpf_test_trace("listxattr"), 0);
+ TESTEQUAL(xattr_size_ret, sizeof(xattr_name));
+ TESTEQUAL(strcmp(xattr_name, xattr_value_ret), 0);
+
+ TESTSYSCALL(s_getxattr(s_path(s(mount_dir), s(file_name)), xattr_name,
+ xattr_value_ret, sizeof(xattr_value_ret),
+ &xattr_size_ret));
+ TESTEQUAL(bpf_test_trace("getxattr"), 0);
+ TESTEQUAL(xattr_size, xattr_size_ret);
+ TESTEQUAL(strcmp(xattr_value, xattr_value_ret), 0);
+
+ TESTSYSCALL(s_removexattr(s_path(s(mount_dir), s(file_name)), xattr_name));
+ TESTEQUAL(bpf_test_trace("removexattr"), 0);
+
+ TESTEQUAL(s_getxattr(s_path(s(mount_dir), s(file_name)), xattr_name,
+ xattr_value_ret, sizeof(xattr_value_ret),
+ &xattr_size_ret), -1);
+ TESTEQUAL(errno, ENODATA);
+
+ TESTSYSCALL(s_unlink(s_path(s(mount_dir), s(file_name))));
+ TESTEQUAL(bpf_test_trace("unlink"), 0);
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(file_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(bpf_fd);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_set_backing(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *backing_name = "backing";
+ const char *test_data = "data";
+ const char *test_name = "test";
+
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int fd = -1;
+ int pid = -1;
+ int status;
+
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse_no_init(mount_dir, NULL, -1, &fuse_dev), 0);
+ FUSE_ACTION
+ char data[256] = {0};
+
+ TESTERR(fd = s_open(s_path(s(mount_dir), s(test_name)),
+ O_RDONLY | O_CLOEXEC), fd != -1);
+ TESTEQUAL(read(fd, data, strlen(test_data)), strlen(test_data));
+ TESTCOND(!strcmp(data, test_data));
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ TESTSYSCALL(umount(mount_dir));
+ FUSE_DAEMON
+ //int bpf_fd = -1;
+ int backing_fd = -1;
+ struct fuse_bpf_entry_out bpf_entry[2];
+
+ TESTERR(backing_fd = s_creat(s_path(s(ft_src), s(backing_name)), 0777),
+ backing_fd != -1);
+ TESTEQUAL(write(backing_fd, test_data, strlen(test_data)),
+ strlen(test_data));
+
+ TESTFUSEINIT();
+ TESTFUSELOOKUP(test_name, 0);
+ bpf_entry[0] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BPF,
+ .name = "trace_ops",
+ };
+ bpf_entry[1] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ };
+ TESTFUSEOUT3_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {0}),
+ fuse_bpf_entry_out, bpf_entry[0],
+ fuse_bpf_entry_out, bpf_entry[1]);
+ read(fuse_dev, bytes_in, sizeof(bytes_in));
+ //TESTSYSCALL(close(bpf_fd));
+ TESTSYSCALL(close(backing_fd));
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ if (!pid)
+ exit(TEST_FAILURE);
+ close(fuse_dev);
+ close(fd);
+ umount(mount_dir);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_set_backing_no_ioctl(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *backing_name = "backing";
+ const char *test_name = "test";
+
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int fd = -1;
+ int pid = -1;
+ int status;
+
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse_no_init(mount_dir, NULL, -1, &fuse_dev), 0);
+ FUSE_ACTION
+
+ TESTERR(fd = s_open(s_path(s(mount_dir), s(test_name)),
+ O_RDONLY | O_CLOEXEC), fd == -1);
+ FUSE_DAEMON
+ int backing_fd = -1;
+ struct fuse_bpf_entry_out bpf_entry[2];
+
+ TESTERR(backing_fd = s_creat(s_path(s(ft_src), s(backing_name)), 0777),
+ backing_fd != -1);
+
+ TESTFUSEINIT();
+ TESTFUSELOOKUP(test_name, 0);
+ bpf_entry[0] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BPF,
+ .name = "trace_ops",
+ };
+ bpf_entry[1] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ };
+ TESTFUSEOUT3_FAIL(fuse_entry_out, ((struct fuse_entry_out) {0}),
+ fuse_bpf_entry_out, bpf_entry[0],
+ fuse_bpf_entry_out, bpf_entry[1]);
+ TESTSYSCALL(close(backing_fd));
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ if (!pid)
+ exit(TEST_FAILURE);
+ close(fuse_dev);
+ close(fd);
+ umount(mount_dir);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_set_backing_folder(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *backing_name = "backingdir";
+ const char *test_name = "testdir";
+ const char *names[] = {"file", ".", ".."};
+
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int fd = -1;
+ int pid = -1;
+ int status;
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse_no_init(mount_dir, NULL, -1, &fuse_dev), 0);
+ FUSE_ACTION
+ DIR *dir = NULL;
+ struct dirent *dirent;
+ int i, j;
+
+ TEST(dir = s_opendir(s_path(s(mount_dir), s(test_name))), dir);
+
+ for (i = 0; i < ARRAY_SIZE(names); ++i) {
+ TEST(dirent = readdir(dir), dirent);
+
+ for (j = 0; j < ARRAY_SIZE(names); ++j)
+ if (names[j] &&
+ strcmp(names[j], dirent->d_name) == 0) {
+ names[j] = NULL;
+ break;
+ }
+ TESTNE(j, ARRAY_SIZE(names));
+ }
+ TEST(dirent = readdir(dir), dirent == NULL);
+ TESTSYSCALL(closedir(dir));
+ dir = NULL;
+ TESTEQUAL(bpf_test_trace("Read Dir"), 0);
+ TESTSYSCALL(umount(mount_dir));
+ FUSE_DAEMON
+ int backing_fd = -1;
+ struct fuse_bpf_entry_out bpf_entry[2];
+
+ TESTSYSCALL(s_mkdir(s_path(s(ft_src), s(backing_name)), 0777));
+ TESTERR(backing_fd = s_open(s_path(s(ft_src), s(backing_name)), O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+ TESTSYSCALL(s_mkdir(s_pathn(3, s(ft_src), s(backing_name), s(names[0])), 0777));
+
+ TESTFUSEINIT();
+ TESTFUSELOOKUP(test_name, 0);
+
+ bpf_entry[0] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BPF,
+ .name = "passthrough",
+ };
+ bpf_entry[1] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ };
+ bpf_entry[0] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BPF,
+ .name = "trace_ops",
+ };
+ bpf_entry[1] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ };
+ TESTFUSEOUT3_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {0}),
+ fuse_bpf_entry_out, bpf_entry[0],
+ fuse_bpf_entry_out, bpf_entry[1]);
+ TESTSYSCALL(close(backing_fd));
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ if (!pid)
+ exit(TEST_FAILURE);
+ close(fuse_dev);
+ close(fd);
+ umount(mount_dir);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_remove_backing(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *folder1 = "folder1";
+ const char *folder2 = "folder2";
+ const char *file = "file1";
+ const char *contents1 = "contents1";
+ const char *contents2 = "contents2";
+
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int fd = -1;
+ int src_fd = -1;
+ int pid = -1;
+ int status;
+ char data[256] = {0};
+
+ /*
+ * Create folder1/file
+ * folder2/file
+ *
+ * test will install bpf into mount
+ * bpf will postfilter root lookup to daemon
+ * daemon will remove bpf and redirect opens on folder1 to folder2
+ * test will open folder1/file which will be redirected to folder2
+ * test will check no traces for file, and contents are folder2/file
+ */
+ TESTEQUAL(bpf_clear_trace(), 0);
+ TESTSYSCALL(s_mkdir(s_path(s(ft_src), s(folder1)), 0777));
+ TEST(fd = s_creat(s_pathn(3, s(ft_src), s(folder1), s(file)), 0777),
+ fd != -1);
+ TESTEQUAL(write(fd, contents1, strlen(contents1)), strlen(contents1));
+ TESTSYSCALL(close(fd));
+ TESTSYSCALL(s_mkdir(s_path(s(ft_src), s(folder2)), 0777));
+ TEST(fd = s_creat(s_pathn(3, s(ft_src), s(folder2), s(file)), 0777),
+ fd != -1);
+ TESTEQUAL(write(fd, contents2, strlen(contents2)), strlen(contents2));
+ TESTSYSCALL(close(fd));
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.passthrough_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse_no_init(mount_dir, "passthrough", src_fd, &fuse_dev), 0);
+
+ FUSE_ACTION
+ TESTERR(fd = s_open(s_pathn(3, s(mount_dir), s(folder1),
+ s(file)),
+ O_RDONLY | O_CLOEXEC), fd != -1);
+ TESTEQUAL(read(fd, data, sizeof(data)), strlen(contents2));
+ TESTCOND(!strcmp(data, contents2));
+ TESTEQUAL(bpf_test_no_trace("file"), 0);
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ TESTSYSCALL(umount(mount_dir));
+ FUSE_DAEMON
+ // The bpf postfilter only sets one fuse_bpf_entry_out
+ struct in_str {
+ char name[8];
+ struct fuse_entry_out feo;
+ struct fuse_bpf_entry_out febo[1];
+ } __attribute__((packed));
+ uint32_t *err_in;
+ struct in_str *in;
+ int backing_fd = -1;
+ struct fuse_bpf_entry_out bpf_entry[2];
+
+ TESTFUSEINIT();
+ TESTFUSEIN_ERR_IN(FUSE_LOOKUP | FUSE_POSTFILTER, in, err_in);
+ TESTEQUAL(*err_in, 0);
+ TEST(backing_fd = s_open(s_path(s(ft_src), s(folder2)),
+ O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+
+ bpf_entry[0] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_REMOVE_BPF,
+ };
+ bpf_entry[1] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ };
+ TESTFUSEOUT3_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {0}),
+ fuse_bpf_entry_out, bpf_entry[0],
+ fuse_bpf_entry_out, bpf_entry[1]);
+
+ while (read(fuse_dev, bytes_in, sizeof(bytes_in)) != -1)
+ ;
+ TESTSYSCALL(close(backing_fd));
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ close(fd);
+ close(src_fd);
+ umount(mount_dir);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_dir_rename(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *dir_name = "dir";
+ const char *dir_name2 = "dir2";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ struct stat st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_mkdir(s_path(s(mount_dir), s(dir_name)), 0777));
+ TESTEQUAL(bpf_test_trace("mkdir"), 0);
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(dir_name)), &st));
+ TESTSYSCALL(s_rename(s_path(s(mount_dir), s(dir_name)),
+ s_path(s(mount_dir), s(dir_name2))));
+ TESTEQUAL(s_stat(s_path(s(ft_src), s(dir_name)), &st), -1);
+ TESTEQUAL(errno, ENOENT);
+ TESTSYSCALL(s_stat(s_path(s(ft_src), s(dir_name2)), &st));
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_file_rename(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *dir = "dir";
+ const char *file1 = "file1";
+ const char *file2 = "file2";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int fd = -1;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_mkdir(s_path(s(mount_dir), s(dir)), 0777));
+ TEST(fd = s_creat(s_pathn(3, s(mount_dir), s(dir), s(file1)), 0777),
+ fd != -1);
+ TESTSYSCALL(s_rename(s_pathn(3, s(mount_dir), s(dir), s(file1)),
+ s_pathn(3, s(mount_dir), s(dir), s(file2))));
+ result = TEST_SUCCESS;
+out:
+ close(fd);
+ umount(mount_dir);
+ close(fuse_dev);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int mmap_test(const char *mount_dir)
+{
+ const char *file = "file";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int fd = -1;
+ char *addr = NULL;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TESTEQUAL(mount_fuse(mount_dir, NULL, src_fd, &fuse_dev), 0);
+ TEST(fd = s_open(s_path(s(mount_dir), s(file)),
+ O_CREAT | O_RDWR | O_CLOEXEC, 0777),
+ fd != -1);
+ TESTSYSCALL(fallocate(fd, 0, 0, 4096));
+ TEST(addr = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0),
+ addr != (void *) -1);
+ memset(addr, 'a', 4096);
+
+ result = TEST_SUCCESS;
+out:
+ munmap(addr, 4096);
+ close(fd);
+ umount(mount_dir);
+ close(fuse_dev);
+ close(src_fd);
+ return result;
+}
+
+static int readdir_perms_test(const char *mount_dir)
+{
+ int result = TEST_FAILURE;
+ struct __user_cap_header_struct uchs = { _LINUX_CAPABILITY_VERSION_3 };
+ struct __user_cap_data_struct ucds[2];
+ int src_fd = -1;
+ int fuse_dev = -1;
+ DIR *dir = NULL;
+
+ /* Must remove capabilities for this test. */
+ TESTSYSCALL(syscall(SYS_capget, &uchs, ucds));
+ ucds[0].effective &= ~(1 << CAP_DAC_OVERRIDE | 1 << CAP_DAC_READ_SEARCH);
+ TESTSYSCALL(syscall(SYS_capset, &uchs, ucds));
+
+ /* This is what we are testing in fuseland. First test without fuse. */
+ TESTSYSCALL(mkdir("test", 0111));
+ TEST(dir = opendir("test"), dir == NULL);
+ if (dir)
+ closedir(dir);
+ dir = NULL;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TESTEQUAL(mount_fuse(mount_dir, NULL, src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_mkdir(s_path(s(mount_dir), s("test")), 0111));
+ TEST(dir = s_opendir(s_path(s(mount_dir), s("test"))), dir == NULL);
+
+ result = TEST_SUCCESS;
+out:
+ ucds[0].effective |= 1 << CAP_DAC_OVERRIDE | 1 << CAP_DAC_READ_SEARCH;
+ syscall(SYS_capset, &uchs, ucds);
+
+ closedir(dir);
+ s_rmdir(s_path(s(mount_dir), s("test")));
+ umount(mount_dir);
+ close(fuse_dev);
+ close(src_fd);
+ rmdir("test");
+ return result;
+}
+
+static int bpf_test_statfs(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int fd = -1;
+ struct statfs st;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TESTSYSCALL(s_statfs(s(mount_dir), &st));
+ TESTEQUAL(bpf_test_trace("statfs"), 0);
+ TESTEQUAL(st.f_type, 0x65735546);
+ result = TEST_SUCCESS;
+out:
+ close(fd);
+ umount(mount_dir);
+ close(fuse_dev);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static int bpf_test_lseek(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *file = "real";
+ const char *test_data = "data";
+ int result = TEST_FAILURE;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int fd = -1;
+
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(fd = openat(src_fd, file, O_CREAT | O_RDWR | O_CLOEXEC, 0777),
+ fd != -1);
+ TESTEQUAL(write(fd, test_data, strlen(test_data)), strlen(test_data));
+ TESTSYSCALL(close(fd));
+ fd = -1;
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.test_trace_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "test_trace_ops", src_fd, &fuse_dev), 0);
+
+ TEST(fd = s_open(s_path(s(mount_dir), s(file)), O_RDONLY | O_CLOEXEC),
+ fd != -1);
+ TESTEQUAL(lseek(fd, 3, SEEK_SET), 3);
+ TESTEQUAL(bpf_test_trace("lseek"), 0);
+ TESTEQUAL(lseek(fd, 5, SEEK_END), 9);
+ TESTEQUAL(bpf_test_trace("lseek"), 0);
+ TESTEQUAL(lseek(fd, 1, SEEK_CUR), 10);
+ TESTEQUAL(bpf_test_trace("lseek"), 0);
+ TESTEQUAL(lseek(fd, 1, SEEK_DATA), 1);
+ TESTEQUAL(bpf_test_trace("lseek"), 0);
+ result = TEST_SUCCESS;
+out:
+ close(fd);
+ umount(mount_dir);
+ close(fuse_dev);
+ close(src_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+/*
+ * State:
+ * Original: dst/folder1/content.txt
+ * ^
+ * |
+ * |
+ * Backing: src/folder1/content.txt
+ *
+ * Step 1: open(folder1) - set backing to src/folder1
+ * Check 1: cat(content.txt) - check not receiving call on the fuse daemon
+ * and content is the same
+ * Step 2: readdirplus(dst)
+ * Check 2: cat(content.txt) - check not receiving call on the fuse daemon
+ * and content is the same
+ */
+static int bpf_test_readdirplus_not_overriding_backing(const char *mount_dir)
+{
+ const char *folder1 = "folder1";
+ const char *content_file = "content.txt";
+ const char *content = "hello world";
+
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int src_fd = -1;
+ int content_fd = -1;
+ int pid = -1;
+ int status;
+
+ TESTSYSCALL(s_mkdir(s_path(s(ft_src), s(folder1)), 0777));
+ TEST(content_fd = s_creat(s_pathn(3, s(ft_src), s(folder1), s(content_file)), 0777),
+ content_fd != -1);
+ TESTEQUAL(write(content_fd, content, strlen(content)), strlen(content));
+ TESTEQUAL(mount_fuse_no_init(mount_dir, NULL, -1, &fuse_dev), 0);
+
+ FUSE_ACTION
+ DIR *open_mount_dir = NULL;
+ struct dirent *mount_dirent;
+ int dst_folder1_fd = -1;
+ int dst_content_fd = -1;
+ int dst_content_read_size = -1;
+ char content_buffer[12] = {0};
+
+ // Step 1: Lookup folder1
+ TESTERR(dst_folder1_fd = s_open(s_path(s(mount_dir), s(folder1)),
+ O_RDONLY | O_CLOEXEC), dst_folder1_fd != -1);
+
+ // Check 1: Read content file (backed)
+ TESTERR(dst_content_fd =
+ s_open(s_pathn(3, s(mount_dir), s(folder1), s(content_file)),
+ O_RDONLY | O_CLOEXEC), dst_content_fd != -1);
+
+ TEST(dst_content_read_size =
+ read(dst_content_fd, content_buffer, strlen(content)),
+ dst_content_read_size == strlen(content) &&
+ strcmp(content, content_buffer) == 0);
+
+ TESTSYSCALL(close(dst_content_fd));
+ dst_content_fd = -1;
+ TESTSYSCALL(close(dst_folder1_fd));
+ dst_folder1_fd = -1;
+ memset(content_buffer, 0, strlen(content));
+
+ // Step 2: readdir the mount root (dst)
+ TEST(open_mount_dir = s_opendir(s(mount_dir)),
+ open_mount_dir != NULL);
+ TEST(mount_dirent = readdir(open_mount_dir), mount_dirent != NULL);
+ TESTSYSCALL(closedir(open_mount_dir));
+ open_mount_dir = NULL;
+
+ // Check 2: Read content file again (must be backed)
+ TESTERR(dst_content_fd =
+ s_open(s_pathn(3, s(mount_dir), s(folder1), s(content_file)),
+ O_RDONLY | O_CLOEXEC), dst_content_fd != -1);
+
+ TEST(dst_content_read_size =
+ read(dst_content_fd, content_buffer, strlen(content)),
+ dst_content_read_size == strlen(content) &&
+ strcmp(content, content_buffer) == 0);
+
+ TESTSYSCALL(close(dst_content_fd));
+ dst_content_fd = -1;
+ FUSE_DAEMON
+ size_t read_size = 0;
+ struct fuse_in_header *in_header = (struct fuse_in_header *)bytes_in;
+ struct fuse_read_out *read_out = NULL;
+ struct fuse_attr attr = {};
+ int backing_fd = -1;
+ DECL_FUSE_IN(open);
+ DECL_FUSE_IN(getattr);
+
+ TESTFUSEINITFLAGS(FUSE_DO_READDIRPLUS | FUSE_READDIRPLUS_AUTO);
+
+ // Step 1: Lookup folder 1 with backing
+ TESTFUSELOOKUP(folder1, 0);
+ TESTSYSCALL(s_fuse_attr(s_path(s(ft_src), s(folder1)), &attr));
+ TEST(backing_fd = s_open(s_path(s(ft_src), s(folder1)),
+ O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+ TESTFUSEOUT2_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = attr.ino,
+ .generation = 0,
+ .entry_valid = UINT64_MAX,
+ .attr_valid = UINT64_MAX,
+ .entry_valid_nsec = UINT32_MAX,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr,
+ }), fuse_bpf_entry_out, ((struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ }));
+ TESTSYSCALL(close(backing_fd));
+
+ // Step 2: Open root dir
+ TESTFUSEIN(FUSE_OPENDIR, open_in);
+ TESTFUSEOUT1(fuse_open_out, ((struct fuse_open_out) {
+ .fh = 100,
+ .open_flags = open_in->flags
+ }));
+
+ // Step 2: Handle getattr
+ TESTFUSEIN(FUSE_GETATTR, getattr_in);
+ TESTSYSCALL(s_fuse_attr(s(ft_src), &attr));
+ TESTFUSEOUT1(fuse_attr_out, ((struct fuse_attr_out) {
+ .attr_valid = UINT64_MAX,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr
+ }));
+
+ // Step 2: Handle readdirplus
+ read_size = read(fuse_dev, bytes_in, sizeof(bytes_in));
+ TESTEQUAL(in_header->opcode, FUSE_READDIRPLUS);
+
+ struct fuse_direntplus *dirent_plus =
+ (struct fuse_direntplus *) (bytes_in + read_size);
+ struct fuse_dirent dirent;
+ struct fuse_entry_out entry_out;
+
+ read_out = (struct fuse_read_out *) (bytes_in +
+ sizeof(*in_header) +
+ sizeof(struct fuse_read_in));
+
+ TESTSYSCALL(s_fuse_attr(s_path(s(ft_src), s(folder1)), &attr));
+
+ dirent = (struct fuse_dirent) {
+ .ino = attr.ino,
+ .off = 1,
+ .namelen = strlen(folder1),
+ .type = DT_REG
+ };
+ entry_out = (struct fuse_entry_out) {
+ .nodeid = attr.ino,
+ .generation = 0,
+ .entry_valid = UINT64_MAX,
+ .attr_valid = UINT64_MAX,
+ .entry_valid_nsec = UINT32_MAX,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr
+ };
+ *dirent_plus = (struct fuse_direntplus) {
+ .dirent = dirent,
+ .entry_out = entry_out
+ };
+
+ strcpy((char *)(bytes_in + read_size + sizeof(*dirent_plus)), folder1);
+ read_size += FUSE_DIRENT_ALIGN(sizeof(*dirent_plus) + strlen(folder1) +
+ 1);
+ TESTFUSEDIROUTREAD(read_out,
+ bytes_in +
+ sizeof(struct fuse_in_header) +
+ sizeof(struct fuse_read_in) +
+ sizeof(struct fuse_read_out),
+ read_size - sizeof(struct fuse_in_header) -
+ sizeof(struct fuse_read_in) -
+ sizeof(struct fuse_read_out));
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+
+out:
+ close(fuse_dev);
+ close(content_fd);
+ close(src_fd);
+ umount(mount_dir);
+ return result;
+}
+
+static int bpf_test_no_readdirplus_without_nodeid(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *folder1 = "folder1";
+ const char *folder2 = "folder2";
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int src_fd = -1;
+ int content_fd = -1;
+ int pid = -1;
+ int status;
+
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.readdir_plus_ops), test_link != NULL);
+ TESTSYSCALL(s_mkdir(s_path(s(ft_src), s(folder1)), 0777));
+ TESTSYSCALL(s_mkdir(s_path(s(ft_src), s(folder2)), 0777));
+ TESTEQUAL(mount_fuse_no_init(mount_dir, NULL, -1, &fuse_dev), 0);
+ FUSE_ACTION
+ DIR *open_dir = NULL;
+ struct dirent *dirent;
+
+ // Folder 1: Readdir with no nodeid
+ TEST(open_dir = s_opendir(s_path(s(ft_dst), s(folder1))),
+ open_dir != NULL);
+ TEST(dirent = readdir(open_dir), dirent == NULL);
+ TESTCOND(errno == EINVAL);
+ TESTSYSCALL(closedir(open_dir));
+ open_dir = NULL;
+
+ // Folder 2: Readdir with a nodeid
+ TEST(open_dir = s_opendir(s_path(s(ft_dst), s(folder2))),
+ open_dir != NULL);
+ TEST(dirent = readdir(open_dir), dirent == NULL);
+ TESTCOND(errno == EINVAL);
+ TESTSYSCALL(closedir(open_dir));
+ open_dir = NULL;
+ FUSE_DAEMON
+ size_t read_size;
+ struct fuse_in_header *in_header = (struct fuse_in_header *)bytes_in;
+ struct fuse_attr attr = {};
+ int backing_fd = -1;
+ struct fuse_bpf_entry_out bpf_entry[2];
+
+ TESTFUSEINITFLAGS(FUSE_DO_READDIRPLUS | FUSE_READDIRPLUS_AUTO);
+
+ // folder 1: Set 0 as nodeid, Expect READDIR
+ TESTFUSELOOKUP(folder1, 0);
+ TEST(backing_fd = s_open(s_path(s(ft_src), s(folder1)),
+ O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+
+ bpf_entry[0] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BPF,
+ .name = "readdir_plus",
+ };
+ bpf_entry[1] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ };
+ TESTFUSEOUT3_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = 0,
+ .generation = 0,
+ .entry_valid = UINT64_MAX,
+ .attr_valid = UINT64_MAX,
+ .entry_valid_nsec = UINT32_MAX,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr,
+ }), fuse_bpf_entry_out, bpf_entry[0],
+ fuse_bpf_entry_out, bpf_entry[1]);
+ TESTSYSCALL(close(backing_fd));
+ TEST(read_size = read(fuse_dev, bytes_in, sizeof(bytes_in)), read_size > 0);
+ TESTEQUAL(in_header->opcode, FUSE_READDIR);
+ TESTFUSEOUTERROR(-EINVAL);
+
+ // folder 2: Set 10 as nodeid, Expect READDIRPLUS
+ TESTFUSELOOKUP(folder2, 0);
+ TEST(backing_fd = s_open(s_path(s(ft_src), s(folder2)),
+ O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+ bpf_entry[0] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BPF,
+ .name = "readdir_plus",
+ };
+ bpf_entry[1] = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ };
+ TESTFUSEOUT3_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = 10,
+ .generation = 0,
+ .entry_valid = UINT64_MAX,
+ .attr_valid = UINT64_MAX,
+ .entry_valid_nsec = UINT32_MAX,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr,
+ }), fuse_bpf_entry_out, bpf_entry[0],
+ fuse_bpf_entry_out, bpf_entry[1]);
+ TESTSYSCALL(close(backing_fd));
+ TEST(read_size = read(fuse_dev, bytes_in, sizeof(bytes_in)), read_size > 0);
+ TESTEQUAL(in_header->opcode, FUSE_READDIRPLUS);
+ TESTFUSEOUTERROR(-EINVAL);
+ FUSE_DONE
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ close(content_fd);
+ close(src_fd);
+ umount(mount_dir);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+/*
+ * State:
+ * Original: dst/folder1/content.txt
+ * ^
+ * |
+ * |
+ * Backing: src/folder1/content.txt
+ *
+ * Step 1: open(folder1) - lookup folder1 with entry_timeout set to 0
+ * Step 2: open(folder1) - lookup folder1 again to trigger revalidate which will
+ * set backing fd
+ *
+ * Check 1: cat(content.txt) - check not receiving call on the fuse daemon
+ * and content is the same
+ */
+static int bpf_test_revalidate_handle_backing_fd(const char *mount_dir)
+{
+ const char *folder1 = "folder1";
+ const char *content_file = "content.txt";
+ const char *content = "hello world";
+ int result = TEST_FAILURE;
+ int fuse_dev = -1;
+ int src_fd = -1;
+ int content_fd = -1;
+ int pid = -1;
+ int status;
+ TESTSYSCALL(s_mkdir(s_path(s(ft_src), s(folder1)), 0777));
+ TEST(content_fd = s_creat(s_pathn(3, s(ft_src), s(folder1), s(content_file)), 0777),
+ content_fd != -1);
+ TESTEQUAL(write(content_fd, content, strlen(content)), strlen(content));
+ TESTSYSCALL(close(content_fd));
+ content_fd = -1;
+ TESTEQUAL(mount_fuse_no_init(mount_dir, NULL, -1, &fuse_dev), 0);
+ FUSE_ACTION
+ int dst_folder1_fd = -1;
+ int dst_content_fd = -1;
+ int dst_content_read_size = -1;
+ char content_buffer[12] = {0};
+ // Step 1: Lookup folder1
+ TESTERR(dst_folder1_fd = s_open(s_path(s(mount_dir), s(folder1)),
+ O_RDONLY | O_CLOEXEC), dst_folder1_fd != -1);
+ TESTSYSCALL(close(dst_folder1_fd));
+ dst_folder1_fd = -1;
+ // Step 2: Lookup folder1 again
+ TESTERR(dst_folder1_fd = s_open(s_path(s(mount_dir), s(folder1)),
+ O_RDONLY | O_CLOEXEC), dst_folder1_fd != -1);
+ TESTSYSCALL(close(dst_folder1_fd));
+ dst_folder1_fd = -1;
+ // Check 1: Read content file (must be backed)
+ TESTERR(dst_content_fd =
+ s_open(s_pathn(3, s(mount_dir), s(folder1), s(content_file)),
+ O_RDONLY | O_CLOEXEC), dst_content_fd != -1);
+ TEST(dst_content_read_size =
+ read(dst_content_fd, content_buffer, strlen(content)),
+ dst_content_read_size == strlen(content) &&
+ strcmp(content, content_buffer) == 0);
+ TESTSYSCALL(close(dst_content_fd));
+ dst_content_fd = -1;
+ FUSE_DAEMON
+ struct fuse_attr attr = {};
+ int backing_fd = -1;
+ TESTFUSEINITFLAGS(FUSE_DO_READDIRPLUS | FUSE_READDIRPLUS_AUTO);
+ // Step 1: Lookup folder1 set entry_timeout to 0 to trigger
+ // revalidate later
+ TESTFUSELOOKUP(folder1, 0);
+ TESTSYSCALL(s_fuse_attr(s_path(s(ft_src), s(folder1)), &attr));
+ TEST(backing_fd = s_open(s_path(s(ft_src), s(folder1)),
+ O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+ TESTFUSEOUT2_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = attr.ino,
+ .generation = 0,
+ .entry_valid = 0,
+ .attr_valid = UINT64_MAX,
+ .entry_valid_nsec = 0,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr,
+ }), fuse_bpf_entry_out, ((struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ }));
+ TESTSYSCALL(close(backing_fd));
+ // Step 2: Lookup folder1 as a reaction to revalidate call
+ // This attempts to change the backing node, which is not allowed on revalidate
+ TESTFUSELOOKUP(folder1, 0);
+ TESTSYSCALL(s_fuse_attr(s_path(s(ft_src), s(folder1)), &attr));
+ TEST(backing_fd = s_open(s_path(s(ft_src), s(folder1)),
+ O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+ TESTFUSEOUT2_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = attr.ino,
+ .generation = 0,
+ .entry_valid = UINT64_MAX,
+ .attr_valid = UINT64_MAX,
+ .entry_valid_nsec = UINT32_MAX,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr,
+ }), fuse_bpf_entry_out, ((struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ }));
+ TESTSYSCALL(close(backing_fd));
+
+ // Lookup folder1 as a reaction to failed revalidate
+ TESTFUSELOOKUP(folder1, 0);
+ TESTSYSCALL(s_fuse_attr(s_path(s(ft_src), s(folder1)), &attr));
+ TEST(backing_fd = s_open(s_path(s(ft_src), s(folder1)),
+ O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ backing_fd != -1);
+ TESTFUSEOUT2_IOCTL(fuse_entry_out, ((struct fuse_entry_out) {
+ .nodeid = attr.ino,
+ .generation = 0,
+ .entry_valid = UINT64_MAX,
+ .attr_valid = UINT64_MAX,
+ .entry_valid_nsec = UINT32_MAX,
+ .attr_valid_nsec = UINT32_MAX,
+ .attr = attr,
+ }), fuse_bpf_entry_out, ((struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_BACKING,
+ .fd = backing_fd,
+ }));
+ TESTSYSCALL(close(backing_fd));
+ FUSE_DONE
+ result = TEST_SUCCESS;
+out:
+ close(fuse_dev);
+ close(content_fd);
+ close(src_fd);
+ umount(mount_dir);
+ return result;
+}
+
+static int bpf_test_lookup_postfilter(const char *mount_dir)
+{
+ struct test_bpf *test_skel = NULL;
+ struct bpf_link *test_link = NULL;
+ const char *file1_name = "file1";
+ const char *file2_name = "file2";
+ const char *file3_name = "file3";
+ int result = TEST_FAILURE;
+ int bpf_fd = -1;
+ int src_fd = -1;
+ int fuse_dev = -1;
+ int file_fd = -1;
+ int pid = -1;
+ int status;
+
+ TEST(file_fd = s_creat(s_path(s(ft_src), s(file1_name)), 0777),
+ file_fd != -1);
+ TESTSYSCALL(close(file_fd));
+ TEST(file_fd = s_creat(s_path(s(ft_src), s(file2_name)), 0777),
+ file_fd != -1);
+ TESTSYSCALL(close(file_fd));
+ file_fd = -1;
+ TEST(src_fd = open(ft_src, O_DIRECTORY | O_RDONLY | O_CLOEXEC),
+ src_fd != -1);
+ TEST(test_skel = test_bpf__open_and_load(), test_skel != NULL);
+ TEST(test_link = bpf_map__attach_struct_ops(test_skel->maps.lookup_postfilter_ops), test_link != NULL);
+ TESTEQUAL(mount_fuse(mount_dir, "lookup_post", src_fd, &fuse_dev), 0);
+ FUSE_ACTION
+ int fd = -1;
+
+ TESTEQUAL(s_open(s_path(s(mount_dir), s(file1_name)), O_RDONLY),
+ -1);
+ TESTEQUAL(errno, ENOENT);
+ TEST(fd = s_open(s_path(s(mount_dir), s(file2_name)), O_RDONLY),
+ fd != -1);
+ TESTSYSCALL(close(fd));
+ TESTEQUAL(s_open(s_path(s(mount_dir), s(file3_name)), O_RDONLY),
+ -1);
+ FUSE_DAEMON
+ struct fuse_entry_out *feo;
+ uint32_t *err_in;
+
+ TESTFUSELOOKUP(file1_name, FUSE_POSTFILTER);
+ TESTFUSEOUTERROR(-ENOENT);
+
+ TESTFUSELOOKUP(file2_name, FUSE_POSTFILTER);
+ feo = (struct fuse_entry_out *) (bytes_in +
+ sizeof(struct fuse_in_header) + strlen(file2_name) + 1);
+ TESTFUSEOUT1(fuse_entry_out, *feo);
+
+ TESTFUSELOOKUP_POST_ERRIN(file3_name, err_in);
+ TESTEQUAL(*err_in, -ENOENT);
+ TESTFUSEOUTERROR(-ENOENT);
+ FUSE_DONE
+
+ result = TEST_SUCCESS;
+out:
+ close(file_fd);
+ close(fuse_dev);
+ umount(mount_dir);
+ close(src_fd);
+ close(bpf_fd);
+ bpf_link__destroy(test_link);
+ test_bpf__destroy(test_skel);
+ return result;
+}
+
+static void parse_range(const char *ranges, bool *run_test, size_t tests)
+{
+ size_t i;
+ char *range;
+
+ for (i = 0; i < tests; ++i)
+ run_test[i] = false;
+
+ range = strtok((char *)ranges, ",");
+ while (range) {
+ char *dash = strchr(range, '-');
+
+ if (dash) {
+ size_t start = 1, end = tests;
+ char *end_ptr;
+
+ if (dash > range) {
+ start = strtol(range, &end_ptr, 10);
+ if (*end_ptr != '-' || start <= 0 || start > tests)
+ ksft_exit_fail_msg("Bad range\n");
+ }
+
+ if (dash[1]) {
+ end = strtol(dash + 1, &end_ptr, 10);
+ if (*end_ptr || end <= start || end > tests)
+ ksft_exit_fail_msg("Bad range\n");
+ }
+
+ for (i = start; i <= end; ++i)
+ run_test[i - 1] = true;
+ } else {
+ char *end;
+ long value = strtol(range, &end, 10);
+
+ if (*end || value <= 0 || value > tests)
+ ksft_exit_fail_msg("Bad range\n");
+ run_test[value - 1] = true;
+ }
+ range = strtok(NULL, ",");
+ }
+}
+
+static int parse_options(int argc, char *const *argv, bool *run_test,
+ size_t tests)
+{
+ int c;
+
+ while ((c = getopt(argc, argv, "f:t:v")) != -1)
+ switch (c) {
+ case 'f':
+ test_options.file = strtol(optarg, NULL, 10);
+ break;
+
+ case 't':
+ parse_range(optarg, run_test, tests);
+ break;
+
+ case 'v':
+ test_options.verbose = true;
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+struct test_case {
+ int (*pfunc)(const char *dir);
+ const char *name;
+};
+
+static void run_one_test(const char *mount_dir,
+ const struct test_case *test_case)
+{
+ ksft_print_msg("Running %s\n", test_case->name);
+ bpf_clear_trace();
+ if (test_case->pfunc(mount_dir) == TEST_SUCCESS)
+ ksft_test_result_pass("%s\n", test_case->name);
+ else
+ ksft_test_result_fail("%s\n", test_case->name);
+}
+
+int main(int argc, char *argv[])
+{
+ char *mount_dir = NULL;
+ char *src_dir = NULL;
+ int i;
+ int fd, count;
+
+#define MAKE_TEST(test) \
+ { \
+ test, #test \
+ }
+ const struct test_case cases[] = {
+ MAKE_TEST(basic_test),
+ MAKE_TEST(bpf_test_real),
+ MAKE_TEST(bpf_test_partial),
+ MAKE_TEST(bpf_test_attrs),
+ MAKE_TEST(bpf_test_readdir),
+ MAKE_TEST(bpf_test_creat),
+ MAKE_TEST(bpf_test_hidden_entries),
+ MAKE_TEST(bpf_test_dir),
+ MAKE_TEST(bpf_test_file_early_close),
+ MAKE_TEST(bpf_test_file_late_close),
+ MAKE_TEST(bpf_test_mknod),
+ MAKE_TEST(bpf_test_largedir),
+ MAKE_TEST(bpf_test_link),
+ MAKE_TEST(bpf_test_symlink),
+ MAKE_TEST(bpf_test_xattr),
+ MAKE_TEST(bpf_test_redact_readdir),
+ MAKE_TEST(bpf_test_set_backing),
+ MAKE_TEST(bpf_test_set_backing_no_ioctl),
+ MAKE_TEST(bpf_test_set_backing_folder),
+ MAKE_TEST(bpf_test_remove_backing),
+ MAKE_TEST(bpf_test_dir_rename),
+ MAKE_TEST(bpf_test_file_rename),
+ MAKE_TEST(bpf_test_alter_errcode_bpf),
+ MAKE_TEST(bpf_test_alter_errcode_userspace),
+ MAKE_TEST(mmap_test),
+ MAKE_TEST(readdir_perms_test),
+ MAKE_TEST(bpf_test_statfs),
+ MAKE_TEST(bpf_test_lseek),
+ MAKE_TEST(bpf_test_readdirplus_not_overriding_backing),
+ MAKE_TEST(bpf_test_no_readdirplus_without_nodeid),
+ MAKE_TEST(bpf_test_revalidate_handle_backing_fd),
+ MAKE_TEST(bpf_test_lookup_postfilter),
+ //MAKE_TEST(bpf_test_verifier),
+ //MAKE_TEST(bpf_test_verifier_out_args),
+ //MAKE_TEST(bpf_test_verifier_packet_invalidation),
+ //MAKE_TEST(bpf_test_verifier_nonsense_read)
+ };
+#undef MAKE_TEST
+
+ bool run_test[ARRAY_SIZE(cases)];
+
+ for (i = 0; i < ARRAY_SIZE(cases); ++i)
+ run_test[i] = true;
+
+ if (parse_options(argc, argv, run_test, ARRAY_SIZE(cases)))
+ ksft_exit_fail_msg("Bad options\n");
+
+ // Seed randomness pool for testing on QEMU
+ // NOTE - this abuses the concept of randomness - do *not* ever do this
+ // on a machine for production use - the device will think it has good
+ // randomness when it does not.
+ fd = open("/dev/urandom", O_WRONLY | O_CLOEXEC);
+ count = 4096;
+ for (i = 0; i < 128; ++i)
+ ioctl(fd, RNDADDTOENTCNT, &count);
+ close(fd);
+
+ ksft_print_header();
+
+ if (geteuid() != 0)
+ ksft_print_msg("Not root, might fail to mount.\n");
+
+ if (tracing_on() != TEST_SUCCESS)
+ ksft_exit_fail_msg("Can't turn on tracing\n");
+
+ src_dir = setup_mount_dir(ft_src);
+ mount_dir = setup_mount_dir(ft_dst);
+ if (src_dir == NULL || mount_dir == NULL)
+ ksft_exit_fail_msg("Can't create a mount dir\n");
+
+ ksft_set_plan(ARRAY_SIZE(run_test));
+
+ for (i = 0; i < ARRAY_SIZE(run_test); ++i)
+ if (run_test[i]) {
+ delete_dir_tree(mount_dir, false);
+ delete_dir_tree(src_dir, false);
+ run_one_test(mount_dir, &cases[i]);
+ } else
+ ksft_cnt.ksft_xskip++;
+
+ umount2(mount_dir, MNT_FORCE);
+ delete_dir_tree(mount_dir, true);
+ delete_dir_tree(src_dir, true);
+ return !ksft_get_fail_cnt() ? ksft_exit_pass() : ksft_exit_fail();
+}
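
The -t option handled by parse_range() above takes a comma-separated list of
1-based test numbers and ranges, where either end of a range may be omitted.
A minimal standalone sketch of that format follows. It is not part of the
patch: parse_range_sketch() is a hypothetical name, it returns -1 instead of
calling ksft_exit_fail_msg(), and unlike the patch it accepts a one-element
range such as "3-3".

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical standalone variant of parse_range(): returns 0 on success
 * and -1 on a malformed list. spec is modified in place by strtok().
 */
static int parse_range_sketch(char *spec, bool *run_test, size_t tests)
{
	char *range;
	size_t i;

	for (i = 0; i < tests; ++i)
		run_test[i] = false;

	for (range = strtok(spec, ","); range; range = strtok(NULL, ",")) {
		char *dash = strchr(range, '-');
		char *end_ptr;
		long start = 1, end = (long)tests;

		if (!dash) {
			/* Single test number, e.g. "5" */
			start = end = strtol(range, &end_ptr, 10);
			if (*end_ptr)
				return -1;
		} else {
			if (dash > range) {
				/* Explicit lower bound, e.g. "2-" or "2-4" */
				start = strtol(range, &end_ptr, 10);
				if (end_ptr != dash)
					return -1;
			}
			if (dash[1]) {
				/* Explicit upper bound, e.g. "-3" or "2-4" */
				end = strtol(dash + 1, &end_ptr, 10);
				if (*end_ptr)
					return -1;
			}
		}
		if (start < 1 || end < start || end > (long)tests)
			return -1;
		for (i = (size_t)start; i <= (size_t)end; ++i)
			run_test[i - 1] = true;
	}
	return 0;
}
```

With tests = 8, the list "1-2,5,7-" selects tests 1, 2, 5, 7 and 8, and a
number outside 1..8 is rejected.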
diff --git a/tools/testing/selftests/filesystems/fuse/test.bpf.c b/tools/testing/selftests/filesystems/fuse/test.bpf.c
new file mode 100644
index 000000000000..3128bf50016f
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/test.bpf.c
@@ -0,0 +1,996 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+// Copyright (c) 2021 Google LLC
+
+#include "vmlinux.h"
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+#include <stdbool.h>
+
+#include "bpf_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+#if 0
+inline __always_inline int local_strcmp(const char *a, const char *b)
+{
+ int i;
+
+ for (i = 0; i < __builtin_strlen(b) + 1; ++i)
+ if (a[i] != b[i])
+ return -1;
+ return 0;
+}
+
+
+/* This is a macro to enforce inlining. Without it, the compiler will do the wrong thing for bpf */
+#define strcmp_check(a, b, end_b) \
+ (((b) + __builtin_strlen(a) + 1 > (end_b)) ? -1 : local_strcmp((b), (a)))
+#endif
+
+//trace ops
+
+BPF_STRUCT_OPS(uint32_t, trace_access_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_access_in *in)
+{
+ bpf_printk("Access: %d", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_getattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getattr_in *in)
+{
+ bpf_printk("Get Attr %d", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_setattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_setattr_in *in)
+{
+ bpf_printk("Set Attr %d", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_opendir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ bpf_printk("Open Dir: %d", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_readdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("Read Dir: fh: %lu, offset %lu", in->fh, in->offset);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_lookup_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char *name_buf;
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 1);
+ bpf_printk("Lookup: %lx %s", meta->nodeid, name_buf);
+ if (meta->nodeid == 1)
+ return BPF_FUSE_USER_PREFILTER;
+ else
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_mknod_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mknod_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("mknod %s %x %x", name_buf, in->rdev | in->mode, in->umask);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_mkdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("mkdir: %s %x %x", name_buf, in->mode, in->umask);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_rmdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("rmdir: %s", name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_rename_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name)
+{
+ struct bpf_dynptr old_name_ptr;
+ //struct bpf_dynptr new_name_ptr;
+ char old_name_buf[255];
+ //char new_name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(old_name, &old_name_ptr);
+ //bpf_fuse_get_ro_dynptr(new_name, &new_name_ptr);
+ bpf_dynptr_read(old_name_buf, 255, &old_name_ptr, 0, 0);
+ //bpf_dynptr_read(new_name_buf, 255, &new_name_ptr, 0, 0);
+ bpf_printk("rename from %s", old_name_buf);
+ //bpf_printk("rename to %s", new_name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_rename2_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename2_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name)
+{
+ struct bpf_dynptr old_name_ptr;
+ //struct bpf_dynptr new_name_ptr;
+ char old_name_buf[255];
+ //char new_name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(old_name, &old_name_ptr);
+ //bpf_fuse_get_ro_dynptr(new_name, &new_name_ptr);
+ bpf_dynptr_read(old_name_buf, 255, &old_name_ptr, 0, 0);
+ //bpf_dynptr_read(new_name_buf, 255, &new_name_ptr, 0, 0);
+ bpf_printk("rename(%x) from %s", in->flags, old_name_buf);
+ //bpf_printk("rename to %s", new_name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_unlink_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("unlink: %s", name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_link_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_link_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char dst_name[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(dst_name, 255, &name_ptr, 0, 0);
+ bpf_printk("link: %d %s", in->oldnodeid, dst_name);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_symlink_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name, struct fuse_buffer *path)
+{
+ struct bpf_dynptr name_ptr;
+ //struct bpf_dynptr path_ptr;
+ char link_name[255];
+ //char link_path[4096];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ //bpf_fuse_get_ro_dynptr(path, &path_ptr);
+ bpf_dynptr_read(link_name, 255, &name_ptr, 0, 0);
+ //bpf_dynptr_read(link_path, 4096, &path_ptr, 0, 0);
+
+ bpf_printk("symlink from %s", link_name);
+ //bpf_printk("symlink to %s", link_path);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_get_link_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char link_name[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(link_name, 255, &name_ptr, 0, 0);
+ bpf_printk("readlink from %s", link_name);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_release_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in)
+{
+ bpf_printk("Release: %d", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_releasedir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in)
+{
+ bpf_printk("Release Dir: %d", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_create_open_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_create_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("Create %s", name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_open_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ bpf_printk("Open: %d", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_read_iter_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("Read: fh: %lu, offset %lu, size %lu",
+ in->fh, in->offset, in->size);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_write_iter_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_write_in *in)
+{
+ bpf_printk("Write: fh: %lu, offset %lu, size %lu",
+ in->fh, in->offset, in->size);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_flush_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_flush_in *in)
+{
+ bpf_printk("flush %d", in->fh);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_file_fallocate_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_fallocate_in *in)
+{
+ bpf_printk("fallocate %d %lu", in->fh, in->length);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_getxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in, struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("getxattr %d %s", meta->nodeid, name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_listxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in)
+{
+ bpf_printk("listxattr %d %d", meta->nodeid, in->size);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_setxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_setxattr_in *in, struct fuse_buffer *name,
+ struct fuse_buffer *value)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("setxattr %d %s", meta->nodeid, name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_removexattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char name_buf[255];
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ bpf_dynptr_read(name_buf, 255, &name_ptr, 0, 0);
+ bpf_printk("removexattr %d %s", meta->nodeid, name_buf);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_statfs_prefilter, const struct bpf_fuse_meta_info *meta)
+{
+ bpf_printk("statfs %d", meta->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, trace_lseek_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_lseek_in *in)
+{
+ bpf_printk("lseek type:%d, offset:%lld", in->whence, in->offset);
+ return BPF_FUSE_CONTINUE;
+}
+
+// readdir_test_ops
+BPF_STRUCT_OPS(uint32_t, readdir_redact_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("readdir %d", in->fh);
+ return BPF_FUSE_POSTFILTER;
+}
+
+BPF_STRUCT_OPS(uint32_t, readdir_redact_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_read_in *in,
+ struct fuse_read_out *out, struct fuse_buffer *buffer)
+{
+ bpf_printk("readdir postfilter %x", in->fh);
+ return BPF_FUSE_USER_POSTFILTER;
+}
+
+// test operations
+
+BPF_STRUCT_OPS(uint32_t, test_lookup_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char *name_buf;
+ bool backing = false;
+ int ret;
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+
+ /* bpf_dynptr_slice will only return a pointer if the dynptr is long enough */
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 8);
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 8, "partial") == 0)
+ backing = true;
+ goto print;
+ }
+
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 6);
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 6, "file1") == 0)
+ backing = true;
+ if (bpf_strncmp(name_buf, 6, "file2") == 0)
+ backing = true;
+ goto print;
+ }
+
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 5);
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 5, "dir2") == 0)
+ backing = true;
+ if (bpf_strncmp(name_buf, 5, "real") == 0)
+ backing = true;
+ goto print;
+ }
+
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 4);
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 4, "dir") == 0)
+ backing = true;
+ goto print;
+ }
+print:
+ if (name_buf)
+ bpf_printk("lookup %s %d", name_buf, backing);
+ else
+ bpf_printk("lookup [name length under 3] %d", backing);
+ return backing ? BPF_FUSE_POSTFILTER : BPF_FUSE_USER;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_lookup_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name,
+ struct fuse_entry_out *out, struct fuse_buffer *entries)
+{
+ struct bpf_dynptr name_ptr;
+ char *name_buf;
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 8);
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 8, "partial") == 0)
+ out->nodeid = 6;
+ goto print;
+ }
+
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 5);
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 5, "real") == 0)
+ out->nodeid = 5;
+ goto print;
+ }
+print:
+ if (name_buf)
+ bpf_printk("post-lookup %s %d", name_buf, out->nodeid);
+ else
+ bpf_printk("post-lookup [name length under 4] %d", out->nodeid);
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_open_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ int backing = BPF_FUSE_USER;
+
+ switch (meta->nodeid) {
+ case 5:
+ backing = BPF_FUSE_CONTINUE;
+ bpf_printk("Setting BPF_FUSE_CONTINUE:%d", BPF_FUSE_CONTINUE);
+ break;
+
+ case 6:
+ backing = BPF_FUSE_POSTFILTER;
+ bpf_printk("Setting BPF_FUSE_POSTFILTER:%d", BPF_FUSE_POSTFILTER);
+ break;
+
+ default:
+ bpf_printk("Setting NOTHING %d", BPF_FUSE_USER);
+ break;
+ }
+
+ bpf_printk("open: %d %d", meta->nodeid, backing);
+ return backing;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_open_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_open_in *in,
+ struct fuse_open_out *out)
+{
+ bpf_printk("open postfilter");
+ return BPF_FUSE_USER_POSTFILTER;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_read_iter_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("read %llu %llu", in->fh, in->offset);
+ if (in->fh == 1 && in->offset == 0)
+ return BPF_FUSE_USER;
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_getattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getattr_in *in)
+{
+ /* real and partial use backing file */
+ int backing = BPF_FUSE_USER;
+
+ switch (meta->nodeid) {
+ case 1:
+ case 5:
+ case 6:
+ /*
+ * TODO: Find better solution
+ * Case 100 keeps clang from emitting a jump table, which BPF does not support
+ */
+ case 100:
+ backing = BPF_FUSE_CONTINUE;
+ break;
+ }
+
+ bpf_printk("getattr %d %d", meta->nodeid, backing);
+ return backing;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_setattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_setattr_in *in)
+{
+ /* real and partial use backing file */
+ int backing = BPF_FUSE_USER;
+
+ switch (meta->nodeid) {
+ case 1:
+ case 5:
+ case 6:
+ /* TODO See above */
+ case 100:
+ backing = BPF_FUSE_CONTINUE;
+ break;
+ }
+
+ bpf_printk("setattr %d %d", meta->nodeid, backing);
+ return backing;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_opendir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ int backing = BPF_FUSE_USER;
+
+ switch (meta->nodeid) {
+ case 1:
+ backing = BPF_FUSE_POSTFILTER;
+ break;
+ }
+ bpf_printk("opendir %d %d", meta->nodeid, backing);
+ return backing;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_opendir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_open_in *in,
+ struct fuse_open_out *out)
+{
+ out->fh = 2;
+ bpf_printk("opendir postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_readdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ int backing = BPF_FUSE_USER;
+
+ if (in->fh == 2)
+ backing = BPF_FUSE_POSTFILTER;
+
+ bpf_printk("readdir %d %d", in->fh, backing);
+ return backing;
+}
+
+BPF_STRUCT_OPS(uint32_t, test_readdir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_read_in *in,
+ struct fuse_read_out *out, struct fuse_buffer *buffer)
+{
+ int backing = BPF_FUSE_CONTINUE;
+
+ if (in->fh == 2)
+ backing = BPF_FUSE_USER_POSTFILTER;
+
+ bpf_printk("readdir postfilter %d %d", in->fh, backing);
+ return backing;
+}
+
+// test_hidden
+
+BPF_STRUCT_OPS(uint32_t, hidden_lookup_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char *name_buf;
+ bool backing = false;
+ int ret;
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+
+ /* bpf_dynptr_slice will only return a pointer if the dynptr is long enough */
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 5);
+ if (name_buf)
+ bpf_printk("Lookup: %s", name_buf);
+ else
+ bpf_printk("lookup [name length under 4]");
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 5, "show") == 0)
+ return BPF_FUSE_CONTINUE;
+ if (bpf_strncmp(name_buf, 5, "hide") == 0)
+ return -ENOENT;
+ }
+
+ return BPF_FUSE_CONTINUE;
+}
+
+// test_error
+
+BPF_STRUCT_OPS(uint32_t, error_mkdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_in *in, struct fuse_buffer *name)
+{
+ bpf_printk("mkdir");
+
+ return BPF_FUSE_POSTFILTER;
+}
+
+BPF_STRUCT_OPS(uint32_t, error_mkdir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_mkdir_in *in, const struct fuse_buffer *name)
+{
+ bpf_printk("mkdir postfilter");
+
+ if (meta->error_in == -EEXIST)
+ return -EPERM;
+ return 0;
+}
+
+BPF_STRUCT_OPS(uint32_t, error_lookup_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ struct bpf_dynptr name_ptr;
+ char *name_buf;
+ bool backing = false;
+ int ret;
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+
+ /* bpf_dynptr_slice will only return a pointer if the dynptr is long enough */
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 1);
+ bpf_printk("lookup prefilter %s", name_buf);
+ return BPF_FUSE_POSTFILTER;
+}
+
+BPF_STRUCT_OPS(uint32_t, error_lookup_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name,
+ struct fuse_entry_out *out, struct fuse_buffer *entries)
+{
+ struct bpf_dynptr name_ptr;
+ char *name_buf;
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 13);
+ if (name_buf)
+ bpf_printk("post-lookup %s %d", name_buf, out->nodeid);
+ else
+ bpf_printk("post-lookup [name length under 12] %d", out->nodeid);
+ if (name_buf) {
+ if (bpf_strncmp(name_buf, 13, "doesnotexist") == 0) {
+ bpf_printk("lookup postfilter doesnotexist");
+ return BPF_FUSE_USER_POSTFILTER;
+ }
+ }
+
+ return 0;
+}
+
+// test readdirplus
+
+BPF_STRUCT_OPS(uint32_t, readdirplus_readdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ return BPF_FUSE_USER;
+}
+
+// Test passthrough
+
+// Reuse error_lookup_prefilter
+
+BPF_STRUCT_OPS(uint32_t, passthrough_lookup_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name,
+ struct fuse_entry_out *out, struct fuse_buffer *entries)
+{
+ struct bpf_dynptr name_ptr;
+ struct bpf_dynptr entries_ptr;
+ char *name_buf;
+ struct fuse_bpf_entry_out entry;
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 1);
+ if (name_buf)
+ bpf_printk("post-lookup %s %d", name_buf, out->nodeid);
+ else
+ bpf_printk("post-lookup [name length under 1???] %d", out->nodeid);
+ bpf_fuse_get_rw_dynptr(entries, &entries_ptr, sizeof(entry), false);
+ entry = (struct fuse_bpf_entry_out) {
+ .entry_type = FUSE_ENTRY_REMOVE_BPF,
+ };
+ bpf_dynptr_write(&entries_ptr, 0, &entry, sizeof(entry), 0);
+
+ return BPF_FUSE_USER_POSTFILTER;
+}
+
+// lookup_postfilter_ops
+
+//reuse error_lookup_prefilter
+
+BPF_STRUCT_OPS(uint32_t, test_bpf_lookup_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name,
+ struct fuse_entry_out *out, struct fuse_buffer *entries)
+{
+ return BPF_FUSE_USER_POSTFILTER;
+}
+
+SEC(".struct_ops")
+struct fuse_ops trace_ops = {
+ .open_prefilter = (void *)trace_open_prefilter,
+ .opendir_prefilter = (void *)trace_opendir_prefilter,
+ .create_open_prefilter = (void *)trace_create_open_prefilter,
+ .release_prefilter = (void *)trace_release_prefilter,
+ .releasedir_prefilter = (void *)trace_releasedir_prefilter,
+ .flush_prefilter = (void *)trace_flush_prefilter,
+ .lseek_prefilter = (void *)trace_lseek_prefilter,
+ //.copy_file_range_prefilter = (void *)trace_copy_file_range_prefilter,
+ //.fsync_prefilter = (void *)trace_fsync_prefilter,
+ //.dir_fsync_prefilter = (void *)trace_dir_fsync_prefilter,
+ .getxattr_prefilter = (void *)trace_getxattr_prefilter,
+ .listxattr_prefilter = (void *)trace_listxattr_prefilter,
+ .setxattr_prefilter = (void *)trace_setxattr_prefilter,
+ .removexattr_prefilter = (void *)trace_removexattr_prefilter,
+ .read_iter_prefilter = (void *)trace_read_iter_prefilter,
+ .write_iter_prefilter = (void *)trace_write_iter_prefilter,
+ .file_fallocate_prefilter = (void *)trace_file_fallocate_prefilter,
+ .lookup_prefilter = (void *)trace_lookup_prefilter,
+ .mknod_prefilter = (void *)trace_mknod_prefilter,
+ .mkdir_prefilter = (void *)trace_mkdir_prefilter,
+ .rmdir_prefilter = (void *)trace_rmdir_prefilter,
+ .rename2_prefilter = (void *)trace_rename2_prefilter,
+ .rename_prefilter = (void *)trace_rename_prefilter,
+ .unlink_prefilter = (void *)trace_unlink_prefilter,
+ .link_prefilter = (void *)trace_link_prefilter,
+ .getattr_prefilter = (void *)trace_getattr_prefilter,
+ .setattr_prefilter = (void *)trace_setattr_prefilter,
+ .statfs_prefilter = (void *)trace_statfs_prefilter,
+ .get_link_prefilter = (void *)trace_get_link_prefilter,
+ .symlink_prefilter = (void *)trace_symlink_prefilter,
+ .readdir_prefilter = (void *)trace_readdir_prefilter,
+ .access_prefilter = (void *)trace_access_prefilter,
+ .name = "trace_ops",
+};
+
+SEC(".struct_ops")
+struct fuse_ops test_trace_ops = {
+ .open_prefilter = (void *)test_open_prefilter,
+ .open_postfilter = (void *)test_open_postfilter,
+ .opendir_prefilter = (void *)test_opendir_prefilter,
+ .opendir_postfilter = (void *)test_opendir_postfilter,
+ .create_open_prefilter = (void *)trace_create_open_prefilter,
+ .release_prefilter = (void *)trace_release_prefilter,
+ .releasedir_prefilter = (void *)trace_releasedir_prefilter,
+ .flush_prefilter = (void *)trace_flush_prefilter,
+ .lseek_prefilter = (void *)trace_lseek_prefilter,
+ //.copy_file_range_prefilter = (void *)trace_copy_file_range_prefilter,
+ //.fsync_prefilter = (void *)trace_fsync_prefilter,
+ //.dir_fsync_prefilter = (void *)trace_dir_fsync_prefilter,
+ .getxattr_prefilter = (void *)trace_getxattr_prefilter,
+ .listxattr_prefilter = (void *)trace_listxattr_prefilter,
+ .setxattr_prefilter = (void *)trace_setxattr_prefilter,
+ .removexattr_prefilter = (void *)trace_removexattr_prefilter,
+ .read_iter_prefilter = (void *)test_read_iter_prefilter,
+ .write_iter_prefilter = (void *)trace_write_iter_prefilter,
+ .file_fallocate_prefilter = (void *)trace_file_fallocate_prefilter,
+ .lookup_prefilter = (void *)test_lookup_prefilter,
+ .lookup_postfilter = (void *)test_lookup_postfilter,
+ .mknod_prefilter = (void *)trace_mknod_prefilter,
+ .mkdir_prefilter = (void *)trace_mkdir_prefilter,
+ .rmdir_prefilter = (void *)trace_rmdir_prefilter,
+ .rename2_prefilter = (void *)trace_rename2_prefilter,
+ .rename_prefilter = (void *)trace_rename_prefilter,
+ .unlink_prefilter = (void *)trace_unlink_prefilter,
+ .link_prefilter = (void *)trace_link_prefilter,
+ .getattr_prefilter = (void *)test_getattr_prefilter,
+ .setattr_prefilter = (void *)test_setattr_prefilter,
+ .statfs_prefilter = (void *)trace_statfs_prefilter,
+ .get_link_prefilter = (void *)trace_get_link_prefilter,
+ .symlink_prefilter = (void *)trace_symlink_prefilter,
+ .readdir_prefilter = (void *)test_readdir_prefilter,
+ .readdir_postfilter = (void *)test_readdir_postfilter,
+ .access_prefilter = (void *)trace_access_prefilter,
+ .name = "test_trace_ops",
+};
+
+SEC(".struct_ops")
+struct fuse_ops readdir_redact_ops = {
+ .readdir_prefilter = (void *)readdir_redact_prefilter,
+ .readdir_postfilter = (void *)readdir_redact_postfilter,
+ .name = "readdir_redact",
+};
+
+SEC(".struct_ops")
+struct fuse_ops test_hidden_ops = {
+ .lookup_prefilter = (void *)hidden_lookup_prefilter,
+ .access_prefilter = (void *)trace_access_prefilter,
+ .create_open_prefilter = (void *)trace_create_open_prefilter,
+ .name = "test_hidden",
+};
+
+SEC(".struct_ops")
+struct fuse_ops test_error_ops = {
+ .lookup_prefilter = (void *)error_lookup_prefilter,
+ .lookup_postfilter = (void *)error_lookup_postfilter,
+ .mkdir_prefilter = (void *)error_mkdir_prefilter,
+ .mkdir_postfilter = (void *)error_mkdir_postfilter,
+ .name = "test_error",
+};
+
+SEC(".struct_ops")
+struct fuse_ops readdir_plus_ops = {
+ .readdir_prefilter = (void *)readdirplus_readdir_prefilter,
+ .name = "readdir_plus",
+};
+
+SEC(".struct_ops")
+struct fuse_ops passthrough_ops = {
+ .lookup_prefilter = (void *)error_lookup_prefilter,
+ .lookup_postfilter = (void *)passthrough_lookup_postfilter,
+ .name = "passthrough",
+};
+
+SEC(".struct_ops")
+struct fuse_ops lookup_postfilter_ops = {
+ .lookup_prefilter = (void *)error_lookup_prefilter,
+ .lookup_postfilter = (void *)test_bpf_lookup_postfilter,
+ .name = "lookup_post",
+};
+
+#if 0
+//TODO: Figure out what to do with these
+SEC("test_verify")
+
+int verify_test(struct __bpf_fuse_args *fa)
+{
+ if (fa->opcode == (FUSE_MKDIR | FUSE_PREFILTER)) {
+ const char *start;
+ const char *end;
+ const struct fuse_mkdir_in *in;
+
+ start = (void *)(long) fa->in_args[0].value;
+ end = (void *)(long) fa->in_args[0].end_offset;
+ if (start + sizeof(*in) <= end) {
+ in = (struct fuse_mkdir_in *)(start);
+ bpf_printk("test1: %d %d", in->mode, in->umask);
+ }
+
+ return BPF_FUSE_CONTINUE;
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC("test_verify_fail")
+
+int verify_fail_test(struct __bpf_fuse_args *fa)
+{
+ struct t {
+ uint32_t a;
+ uint32_t b;
+ char d[];
+ };
+ if (fa->opcode == (FUSE_MKDIR | FUSE_PREFILTER)) {
+ const char *start;
+ const char *end;
+ const struct t *c;
+
+ start = (void *)(long) fa->in_args[0].value;
+ end = (void *)(long) fa->in_args[0].end_offset;
+ if (start + sizeof(struct t) <= end) {
+ c = (struct t *)start;
+ bpf_printk("test1: %d %d %d", c->a, c->b, c->d[0]);
+ }
+ return BPF_FUSE_CONTINUE;
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC("test_verify_fail2")
+
+int verify_fail_test2(struct __bpf_fuse_args *fa)
+{
+ if (fa->opcode == (FUSE_MKDIR | FUSE_PREFILTER)) {
+ const char *start;
+ const char *end;
+ struct fuse_mkdir_in *c;
+
+ start = (void *)(long) fa->in_args[0].value;
+ end = (void *)(long) fa->in_args[1].end_offset;
+ if (start + sizeof(*c) <= end) {
+ c = (struct fuse_mkdir_in *)start;
+ bpf_printk("test1: %d %d", c->mode, c->umask);
+ }
+ return BPF_FUSE_CONTINUE;
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC("test_verify_fail3")
+/* Cannot write directly to fa */
+int verify_fail_test3(struct __bpf_fuse_args *fa)
+{
+ if (fa->opcode == (FUSE_LOOKUP | FUSE_POSTFILTER)) {
+ const char *name = (void *)(long)fa->in_args[0].value;
+ const char *end = (void *)(long)fa->in_args[0].end_offset;
+ struct fuse_entry_out *feo = fa_verify_out(fa, 0, sizeof(*feo));
+
+ if (!feo)
+ return -1;
+
+ if (strcmp_check("real", name, end) == 0)
+ feo->nodeid = 5;
+ else if (strcmp_check("partial", name, end) == 0)
+ feo->nodeid = 6;
+
+ bpf_printk("post-lookup %s %d", name, feo->nodeid);
+ return BPF_FUSE_CONTINUE;
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC("test_verify_fail4")
+/* Cannot write outside of requested area */
+int verify_fail_test4(struct __bpf_fuse_args *fa)
+{
+ if (fa->opcode == (FUSE_LOOKUP | FUSE_POSTFILTER)) {
+ const char *name = (void *)(long)fa->in_args[0].value;
+ const char *end = (void *)(long)fa->in_args[0].end_offset;
+ struct fuse_entry_out *feo = bpf_make_writable_out(fa, 0, fa->out_args[0].value,
+ 1, true);
+
+ if (!feo)
+ return -1;
+
+ if (strcmp_check("real", name, end) == 0)
+ feo->nodeid = 5;
+ else if (strcmp_check("partial", name, end) == 0)
+ feo->nodeid = 6;
+
+ bpf_printk("post-lookup %s %d", name, feo->nodeid);
+ return BPF_FUSE_CONTINUE;
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC("test_verify_fail5")
+/* Cannot use old verification after requesting writable */
+int verify_fail_test5(struct __bpf_fuse_args *fa)
+{
+ if (fa->opcode == (FUSE_LOOKUP | FUSE_POSTFILTER)) {
+ struct fuse_entry_out *feo;
+ struct fuse_entry_out *feo_w;
+
+ feo = fa_verify_out(fa, 0, sizeof(*feo));
+ if (!feo)
+ return -1;
+
+ feo_w = bpf_make_writable_out(fa, 0, fa->out_args[0].value, sizeof(*feo_w), true);
+ bpf_printk("post-lookup %d", feo->nodeid);
+ if (!feo_w)
+ return -1;
+
+ feo_w->nodeid = 5;
+
+ return BPF_FUSE_CONTINUE;
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC("test_verify5")
+/* Can use new verification after requesting writable */
+int verify_pass_test5(struct __bpf_fuse_args *fa)
+{
+ if (fa->opcode == (FUSE_LOOKUP | FUSE_POSTFILTER)) {
+ struct fuse_entry_out *feo;
+ struct fuse_entry_out *feo_w;
+
+ feo = fa_verify_out(fa, 0, sizeof(*feo));
+ if (!feo)
+ return -1;
+
+ bpf_printk("post-lookup %d", feo->nodeid);
+
+ feo_w = bpf_make_writable_out(fa, 0, fa->out_args[0].value, sizeof(*feo_w), true);
+
+ feo = fa_verify_out(fa, 0, sizeof(*feo));
+ if (feo)
+ bpf_printk("post-lookup %d", feo->nodeid);
+ if (!feo_w)
+ return -1;
+
+ feo_w->nodeid = 5;
+
+ return BPF_FUSE_CONTINUE;
+ }
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC("test_verify_fail6")
+/* Reading context from a nonsense offset is not allowed */
+int verify_fail_test6(struct __bpf_fuse_args *fa)
+{
+ char *nonsense = (char *)fa;
+
+ bpf_printk("post-lookup %d", nonsense[1]);
+
+ return BPF_FUSE_CONTINUE;
+}
+#endif
diff --git a/tools/testing/selftests/filesystems/fuse/test_framework.h b/tools/testing/selftests/filesystems/fuse/test_framework.h
new file mode 100644
index 000000000000..24896b5e172f
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/test_framework.h
@@ -0,0 +1,172 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Google LLC
+ */
+
+#ifndef _TEST_FRAMEWORK_H
+#define _TEST_FRAMEWORK_H
+
+#include <stdbool.h>
+#include <stdio.h>
+
+#ifdef __ANDROID__
+static int test_case_pass;
+static int test_case_fail;
+#define ksft_print_msg printf
+#define ksft_test_result_pass(...) ({test_case_pass++; printf(__VA_ARGS__); })
+#define ksft_test_result_fail(...) ({test_case_fail++; printf(__VA_ARGS__); })
+#define ksft_exit_fail_msg(...) printf(__VA_ARGS__)
+#define ksft_print_header()
+#define ksft_set_plan(cnt)
+#define ksft_get_fail_cnt() test_case_fail
+#define ksft_exit_pass() 0
+#define ksft_exit_fail() 1
+#else
+#include <kselftest.h>
+#endif
+
+#define TEST_FAILURE 1
+#define TEST_SUCCESS 0
+
+#define ptr_to_u64(p) ((__u64)(unsigned long)(p))
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define le16_to_cpu(x) (x)
+#define le32_to_cpu(x) (x)
+#define le64_to_cpu(x) (x)
+#else
+#error Big endian not supported!
+#endif
+
+struct _test_options {
+ int file;
+ bool verbose;
+};
+
+extern struct _test_options test_options;
+
+#define TESTCOND(condition) \
+ do { \
+ if (!(condition)) { \
+ ksft_print_msg("%s failed %d\n", \
+ __func__, __LINE__); \
+ goto out; \
+ } else if (test_options.verbose) \
+ ksft_print_msg("%s succeeded %d\n", \
+ __func__, __LINE__); \
+ } while (false)
+
+#define TESTCONDERR(condition) \
+ do { \
+ if (!(condition)) { \
+ ksft_print_msg("%s failed %d\n", \
+ __func__, __LINE__); \
+ ksft_print_msg("Error %d (\"%s\")\n", \
+ errno, strerror(errno)); \
+ goto out; \
+ } else if (test_options.verbose) \
+ ksft_print_msg("%s succeeded %d\n", \
+ __func__, __LINE__); \
+ } while (false)
+
+#define TEST(statement, condition) \
+ do { \
+ statement; \
+ TESTCOND(condition); \
+ } while (false)
+
+#define TESTERR(statement, condition) \
+ do { \
+ statement; \
+ TESTCONDERR(condition); \
+ } while (false)
+
+enum _operator {
+ _eq,
+ _ne,
+ _ge,
+};
+
+static const char * const _operator_name[] = {
+ "==",
+ "!=",
+ ">=",
+};
+
+#define _TEST_OPERATOR(name, _type, format_specifier) \
+static inline int _test_operator_##name(const char *func, int line, \
+ _type a, _type b, enum _operator o) \
+{ \
+ bool pass = false; \
+ switch (o) { \
+ case _eq: pass = a == b; break; \
+ case _ne: pass = a != b; break; \
+ case _ge: pass = a >= b; break; \
+ } \
+ \
+ if (!pass) \
+ ksft_print_msg("Failed: %s at line %d, " \
+ format_specifier " %s " \
+ format_specifier "\n", \
+ func, line, a, _operator_name[o], b); \
+ else if (test_options.verbose) \
+ ksft_print_msg("Passed: %s at line %d, " \
+ format_specifier " %s " \
+ format_specifier "\n", \
+ func, line, a, _operator_name[o], b); \
+ \
+ return pass ? TEST_SUCCESS : TEST_FAILURE; \
+}
+
+_TEST_OPERATOR(i, int, "%d")
+_TEST_OPERATOR(ui, unsigned int, "%u")
+_TEST_OPERATOR(lui, unsigned long, "%lu")
+_TEST_OPERATOR(ss, ssize_t, "%zd")
+_TEST_OPERATOR(vp, void *, "%p")
+_TEST_OPERATOR(cp, char *, "%p")
+
+#define _CALL_TO(_type, name, a, b, o) \
+ _type:_test_operator_##name(__func__, __LINE__, \
+ (_type) (long long) (a), \
+ (_type) (long long) (b), o)
+
+#define TESTOPERATOR(a, b, o) \
+ do { \
+ if (_Generic((a), \
+ _CALL_TO(int, i, a, b, o), \
+ _CALL_TO(unsigned int, ui, a, b, o), \
+ _CALL_TO(unsigned long, lui, a, b, o), \
+ _CALL_TO(ssize_t, ss, a, b, o), \
+ _CALL_TO(void *, vp, a, b, o), \
+ _CALL_TO(char *, cp, a, b, o) \
+ )) \
+ goto out; \
+ } while (false)
+
+#define TESTEQUAL(a, b) TESTOPERATOR(a, b, _eq)
+#define TESTNE(a, b) TESTOPERATOR(a, b, _ne)
+#define TESTGE(a, b) TESTOPERATOR(a, b, _ge)
+
+/* For testing a syscall that returns 0 on success and sets errno otherwise */
+#define TESTSYSCALL(statement) TESTCONDERR((statement) == 0)
+
+static inline void print_bytes(const void *data, size_t size)
+{
+ const char *bytes = data;
+ size_t i;
+
+ for (i = 0; i < size; ++i) {
+ if (i % 0x10 == 0)
+ printf("%08zx:", i);
+ printf("%02x ", (unsigned int) (unsigned char) bytes[i]);
+ if (i % 0x10 == 0x0f)
+ printf("\n");
+ }
+
+ if (i % 0x10 != 0)
+ printf("\n");
+}
+
+
+
+#endif
diff --git a/tools/testing/selftests/filesystems/fuse/test_fuse.h b/tools/testing/selftests/filesystems/fuse/test_fuse.h
new file mode 100644
index 000000000000..ca22b26775a0
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/test_fuse.h
@@ -0,0 +1,494 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Google LLC
+ */
+
+#ifndef TEST_FUSE__H
+#define TEST_FUSE__H
+
+#define _GNU_SOURCE
+
+#include "test_framework.h"
+
+#include <dirent.h>
+#include <sys/stat.h>
+#include <sys/statfs.h>
+#include <sys/types.h>
+
+#include <uapi/linux/fuse.h>
+
+#define PAGE_SIZE 4096
+#define FUSE_POSTFILTER 0x20000
+
+extern struct _test_options test_options;
+
+/* Slow but semantically easy string functions */
+
+/*
+ * struct s just wraps a char pointer
+ * It is a pointer to a malloc'd string, or null
+ * All consumers handle null input correctly
+ * All consumers free the string
+ */
+struct s {
+ char *s;
+};
+
+struct s s(const char *s1);
+struct s sn(const char *s1, const char *s2);
+int s_cmp(struct s s1, struct s s2);
+struct s s_cat(struct s s1, struct s s2);
+struct s s_splitleft(struct s s1, char c);
+struct s s_splitright(struct s s1, char c);
+struct s s_word(struct s s1, char c, size_t n);
+struct s s_path(struct s s1, struct s s2);
+struct s s_pathn(size_t n, struct s s1, ...);
+int s_link(struct s src_pathname, struct s dst_pathname);
+int s_symlink(struct s src_pathname, struct s dst_pathname);
+int s_mkdir(struct s pathname, mode_t mode);
+int s_rmdir(struct s pathname);
+int s_unlink(struct s pathname);
+int s_open(struct s pathname, int flags, ...);
+int s_openat(int dirfd, struct s pathname, int flags, ...);
+int s_creat(struct s pathname, mode_t mode);
+int s_mkfifo(struct s pathname, mode_t mode);
+int s_stat(struct s pathname, struct stat *st);
+int s_statfs(struct s pathname, struct statfs *st);
+int s_fuse_attr(struct s pathname, struct fuse_attr *fuse_attr_out);
+DIR *s_opendir(struct s pathname);
+int s_getxattr(struct s pathname, const char name[], void *value, size_t size,
+ ssize_t *ret_size);
+int s_listxattr(struct s pathname, void *list, size_t size, ssize_t *ret_size);
+int s_setxattr(struct s pathname, const char name[], const void *value,
+ size_t size, int flags);
+int s_removexattr(struct s pathname, const char name[]);
+int s_rename(struct s oldpathname, struct s newpathname);
+
+struct s tracing_folder(void);
+int tracing_on(void);
+
+char *concat_file_name(const char *dir, const char *file);
+char *setup_mount_dir(const char *name);
+int delete_dir_tree(const char *dir_path, bool remove_root);
+
+#define TESTFUSEINNULL(_opcode) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ ssize_t res = read(fuse_dev, &bytes_in, \
+ sizeof(bytes_in)); \
+ \
+ TESTEQUAL(in_header->opcode, _opcode); \
+ TESTEQUAL(res, sizeof(*in_header)); \
+ } while (false)
+
+static inline void print_header(struct fuse_in_header *header)
+{
+ printf("~~HEADER~~\n");
+ printf("len:\t%u\n", header->len);
+ printf("opcode:\t%u\n", header->opcode);
+ printf("unique:\t%llu\n", (unsigned long long)header->unique);
+ printf("nodeid:\t%llu\n", (unsigned long long)header->nodeid);
+ printf("uid:\t%u\n", header->uid);
+ printf("gid:\t%u\n", header->gid);
+ printf("pid:\t%u\n", header->pid);
+ printf("total_extlen:\t%u\n", header->total_extlen);
+ printf("padding:\t%u\n", header->padding);
+}
+
+static inline int test_fuse_in(int fuse_dev, uint8_t *bytes_in, int opcode, int size)
+{
+ struct fuse_in_header *in_header =
+ (struct fuse_in_header *)bytes_in;
+ ssize_t res = read(fuse_dev, bytes_in,
+ FUSE_MIN_READ_BUFFER);
+
+ TESTEQUAL(res, sizeof(*in_header) + size);
+ TESTEQUAL(in_header->opcode, opcode);
+ return 0;
+out:
+ return -1;
+}
+
+#define ERR_IN_EXT_LEN (FUSE_REC_ALIGN(sizeof(struct fuse_ext_header) + sizeof(uint32_t)))
+
+#define TESTFUSEIN(_opcode, in_struct) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ ssize_t res = read(fuse_dev, &bytes_in, \
+ sizeof(bytes_in)); \
+ \
+ TESTEQUAL(res, sizeof(*in_header) + sizeof(*in_struct));\
+ TESTEQUAL(in_header->opcode, _opcode); \
+ in_struct = (void *)(bytes_in + sizeof(*in_header)); \
+ } while (false)
+
+#define TESTFUSEIN_ERR_IN(_opcode, in_struct, err_in) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ struct fuse_ext_header *ext_h; \
+ ssize_t res = read(fuse_dev, &bytes_in, \
+ sizeof(bytes_in)); \
+ \
+ TESTEQUAL(res, sizeof(*in_header) + sizeof(*in_struct) \
+ + ERR_IN_EXT_LEN); \
+ TESTEQUAL(in_header->opcode, _opcode); \
+ in_struct = (void *)(bytes_in + sizeof(*in_header)); \
+ ext_h = (void *)&bytes_in[in_header->len \
+ - in_header->total_extlen * 8]; \
+ err_in = (void *)&ext_h[1]; \
+ } while (false)
+
+#define TESTFUSEIN2(_opcode, in_struct1, in_struct2) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ ssize_t res = read(fuse_dev, &bytes_in, \
+ sizeof(bytes_in)); \
+ \
+ TESTEQUAL(res, sizeof(*in_header) + sizeof(*in_struct1) \
+ + sizeof(*in_struct2)); \
+ TESTEQUAL(in_header->opcode, _opcode); \
+ in_struct1 = (void *)(bytes_in + sizeof(*in_header)); \
+ in_struct2 = (void *)(bytes_in + sizeof(*in_header) \
+ + sizeof(*in_struct1)); \
+ } while (false)
+
+#define TESTFUSEIN2_ERR_IN(_opcode, in_struct1, in_struct2, err_in) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ struct fuse_ext_header *ext_h; \
+ ssize_t res = read(fuse_dev, &bytes_in, \
+ sizeof(bytes_in)); \
+ \
+ TESTEQUAL(res, sizeof(*in_header) + sizeof(*in_struct1) \
+ + sizeof(*in_struct2) \
+ + ERR_IN_EXT_LEN); \
+ TESTEQUAL(in_header->opcode, _opcode); \
+ in_struct1 = (void *)(bytes_in + sizeof(*in_header)); \
+ in_struct2 = (void *)(bytes_in + sizeof(*in_header) \
+ + sizeof(*in_struct1)); \
+ ext_h = (void *)&bytes_in[in_header->len \
+ - in_header->total_extlen * 8]; \
+ err_in = (void *)&ext_h[1]; \
+ } while (false)
+
+#define TESTFUSEINEXT(_opcode, in_struct, extra) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ ssize_t res = read(fuse_dev, &bytes_in, \
+ sizeof(bytes_in)); \
+ \
+ TESTEQUAL(in_header->opcode, _opcode); \
+ TESTEQUAL(res, \
+ sizeof(*in_header) + sizeof(*in_struct) + extra);\
+ in_struct = (void *)(bytes_in + sizeof(*in_header)); \
+ } while (false)
+
+#define TESTFUSEINUNKNOWN() \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ ssize_t res = read(fuse_dev, &bytes_in, \
+ sizeof(bytes_in)); \
+ \
+ TESTGE(res, sizeof(*in_header)); \
+ TESTEQUAL(in_header->opcode, -1); \
+ } while (false)
+
+/* Special case lookup since it is asymmetric */
+#define TESTFUSELOOKUP(expected, filter) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ char *name = (char *) (bytes_in + sizeof(*in_header)); \
+ ssize_t res; \
+ \
+ TEST(res = read(fuse_dev, &bytes_in, sizeof(bytes_in)), \
+ res != -1); \
+ /* TODO once we handle forgets properly, remove */ \
+ if (in_header->opcode == FUSE_FORGET) \
+ continue; \
+ if (in_header->opcode == FUSE_BATCH_FORGET) \
+ continue; \
+ TESTGE(res, sizeof(*in_header)); \
+ TESTEQUAL(in_header->opcode, \
+ FUSE_LOOKUP | filter); \
+ /* Post filter only receives fuse_bpf_entry_out if it's \
+ * filled in. TODO: Should we populate this for user \
+ * postfilter, and if so, how to handle backing? */ \
+ TESTEQUAL(res, \
+ sizeof(*in_header) + strlen(expected) + 1 + \
+ (filter == FUSE_POSTFILTER ? \
+ sizeof(struct fuse_entry_out) + \
+ sizeof(struct fuse_bpf_entry_out) * 0 + \
+ ERR_IN_EXT_LEN: 0)); \
+ TESTCOND(!strcmp(name, expected)); \
+ break; \
+ } while (true)
+
+/* Special case lookup since it is asymmetric */
+#define TESTFUSELOOKUP_POST_ERRIN(expected, err_in) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ struct fuse_ext_header *ext_h; \
+ char *name = (char *) (bytes_in + sizeof(*in_header)); \
+ ssize_t res; \
+ \
+ TEST(res = read(fuse_dev, &bytes_in, sizeof(bytes_in)), \
+ res != -1); \
+ /* TODO once we handle forgets properly, remove */ \
+ if (in_header->opcode == FUSE_FORGET) \
+ continue; \
+ if (in_header->opcode == FUSE_BATCH_FORGET) \
+ continue; \
+ TESTGE(res, sizeof(*in_header)); \
+ TESTEQUAL(in_header->opcode, \
+ FUSE_LOOKUP | FUSE_POSTFILTER); \
+ /* Post filter only receives fuse_bpf_entry_out if it's \
+ * filled in. TODO: Should we populate this for user \
+ * postfilter, and if so, how to handle backing? */ \
+ TESTEQUAL(res, \
+ sizeof(*in_header) + strlen(expected) + 1 + \
+ sizeof(struct fuse_entry_out) + \
+ sizeof(struct fuse_bpf_entry_out) * 0 + \
+ ERR_IN_EXT_LEN); \
+ TESTCOND(!strcmp(name, expected)); \
+ \
+ ext_h = (void *)&bytes_in[in_header->len \
+ - in_header->total_extlen * 8]; \
+ err_in = (void *)&ext_h[1]; \
+ break; \
+ } while (true)
+
+#define TESTFUSEOUTEMPTY() \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ struct fuse_out_header *out_header = \
+ (struct fuse_out_header *)bytes_out; \
+ \
+ *out_header = (struct fuse_out_header) { \
+ .len = sizeof(*out_header), \
+ .unique = in_header->unique, \
+ }; \
+ TESTEQUAL(write(fuse_dev, bytes_out, out_header->len), \
+ out_header->len); \
+ } while (false)
+
+#define TESTFUSEOUTERROR(err) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ struct fuse_out_header *out_header = \
+ (struct fuse_out_header *)bytes_out; \
+ \
+ *out_header = (struct fuse_out_header) { \
+ .len = sizeof(*out_header), \
+ .error = err, \
+ .unique = in_header->unique, \
+ }; \
+ TESTEQUAL(write(fuse_dev, bytes_out, out_header->len), \
+ out_header->len); \
+ } while (false)
+
+#define TESTFUSEOUTREAD(data, length) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ struct fuse_out_header *out_header = \
+ (struct fuse_out_header *)bytes_out; \
+ \
+ *out_header = (struct fuse_out_header) { \
+ .len = sizeof(*out_header) + length, \
+ .unique = in_header->unique, \
+ }; \
+ memcpy(bytes_out + sizeof(*out_header), data, length); \
+ TESTEQUAL(write(fuse_dev, bytes_out, out_header->len), \
+ out_header->len); \
+ } while (false)
+
+#define TESTFUSEDIROUTREAD(read_out, data, length) \
+ do { \
+ struct fuse_in_header *in_header = \
+ (struct fuse_in_header *)bytes_in; \
+ struct fuse_out_header *out_header = \
+ (struct fuse_out_header *)bytes_out; \
+ \
+ *out_header = (struct fuse_out_header) { \
+ .len = sizeof(*out_header) + \
+ sizeof(*read_out) + length, \
+ .unique = in_header->unique, \
+ }; \
+ memcpy(bytes_out + sizeof(*out_header) + \
+ sizeof(*read_out), data, length); \
+ memcpy(bytes_out + sizeof(*out_header), \
+ read_out, sizeof(*read_out)); \
+ TESTEQUAL(write(fuse_dev, bytes_out, out_header->len), \
+ out_header->len); \
+ } while (false)
+
+#define TESTFUSEOUT1(type1, obj1) \
+ do { \
+ *(struct fuse_out_header *) bytes_out \
+ = (struct fuse_out_header) { \
+ .len = sizeof(struct fuse_out_header) \
+ + sizeof(struct type1), \
+ .unique = ((struct fuse_in_header *) \
+ bytes_in)->unique, \
+ }; \
+ *(struct type1 *) (bytes_out \
+ + sizeof(struct fuse_out_header)) \
+ = obj1; \
+ TESTEQUAL(write(fuse_dev, bytes_out, \
+ ((struct fuse_out_header *)bytes_out)->len), \
+ ((struct fuse_out_header *)bytes_out)->len); \
+ } while (false)
+
+#define SETFUSEOUT2(type1, obj1, type2, obj2) \
+ do { \
+ *(struct fuse_out_header *) bytes_out \
+ = (struct fuse_out_header) { \
+ .len = sizeof(struct fuse_out_header) \
+ + sizeof(struct type1) \
+ + sizeof(struct type2), \
+ .unique = ((struct fuse_in_header *) \
+ bytes_in)->unique, \
+ }; \
+ *(struct type1 *) (bytes_out \
+ + sizeof(struct fuse_out_header)) \
+ = obj1; \
+ *(struct type2 *) (bytes_out \
+ + sizeof(struct fuse_out_header) \
+ + sizeof(struct type1)) \
+ = obj2; \
+ } while (false)
+
+#define TESTFUSEOUT2(type1, obj1, type2, obj2) \
+ do { \
+ SETFUSEOUT2(type1, obj1, type2, obj2); \
+ TESTEQUAL(write(fuse_dev, bytes_out, \
+ ((struct fuse_out_header *)bytes_out)->len), \
+ ((struct fuse_out_header *)bytes_out)->len); \
+ } while (false)
+
+#define TESTFUSEOUT2_IOCTL(type1, obj1, type2, obj2) \
+ do { \
+ SETFUSEOUT2(type1, obj1, type2, obj2); \
+ TESTEQUAL(ioctl(fuse_dev, \
+ FUSE_DEV_IOC_BPF_RESPONSE( \
+ ((struct fuse_out_header *)bytes_out)->len), \
+ bytes_out), \
+ ((struct fuse_out_header *)bytes_out)->len); \
+ } while (false)
+
+#define SETFUSEOUT3(type1, obj1, type2, obj2, type3, obj3) \
+ do { \
+ *(struct fuse_out_header *) bytes_out \
+ = (struct fuse_out_header) { \
+ .len = sizeof(struct fuse_out_header) \
+ + sizeof(struct type1) \
+ + sizeof(struct type2) \
+ + sizeof(struct type3), \
+ .unique = ((struct fuse_in_header *) \
+ bytes_in)->unique, \
+ }; \
+ *(struct type1 *) (bytes_out \
+ + sizeof(struct fuse_out_header)) \
+ = obj1; \
+ *(struct type2 *) (bytes_out \
+ + sizeof(struct fuse_out_header) \
+ + sizeof(struct type1)) \
+ = obj2; \
+ *(struct type3 *) (bytes_out \
+ + sizeof(struct fuse_out_header) \
+ + sizeof(struct type1) \
+ + sizeof(struct type2)) \
+ = obj3; \
+ } while (false)
+
+#define TESTFUSEOUT3(type1, obj1, type2, obj2, type3, obj3) \
+ do { \
+ SETFUSEOUT3(type1, obj1, type2, obj2, type3, obj3); \
+ TESTEQUAL(write(fuse_dev, bytes_out, \
+ ((struct fuse_out_header *)bytes_out)->len), \
+ ((struct fuse_out_header *)bytes_out)->len); \
+ } while (false)
+
+#define TESTFUSEOUT3_FAIL(type1, obj1, type2, obj2, type3, obj3) \
+ do { \
+ SETFUSEOUT3(type1, obj1, type2, obj2, type3, obj3); \
+ TESTEQUAL(write(fuse_dev, bytes_out, \
+ ((struct fuse_out_header *)bytes_out)->len), \
+ -1); \
+ } while (false)
+
+#define FUSE_DEV_IOC_BPF_RESPONSE(N) _IOW(FUSE_DEV_IOC_MAGIC, 125, char[N])
+
+#define TESTFUSEOUT3_IOCTL(type1, obj1, type2, obj2, type3, obj3) \
+ do { \
+ SETFUSEOUT3(type1, obj1, type2, obj2, type3, obj3); \
+ TESTEQUAL(ioctl(fuse_dev, \
+ FUSE_DEV_IOC_BPF_RESPONSE( \
+ ((struct fuse_out_header *)bytes_out)->len), \
+ bytes_out), \
+ ((struct fuse_out_header *)bytes_out)->len); \
+ } while (false)
+
+#define TESTFUSEINITFLAGS(fuse_connection_flags) \
+ do { \
+ DECL_FUSE_IN(init); \
+ \
+ TESTFUSEIN(FUSE_INIT, init_in); \
+ TESTEQUAL(init_in->major, FUSE_KERNEL_VERSION); \
+ TESTEQUAL(init_in->minor, FUSE_KERNEL_MINOR_VERSION); \
+ TESTFUSEOUT1(fuse_init_out, ((struct fuse_init_out) { \
+ .major = FUSE_KERNEL_VERSION, \
+ .minor = FUSE_KERNEL_MINOR_VERSION, \
+ .max_readahead = 4096, \
+ .flags = fuse_connection_flags, \
+ .max_background = 0, \
+ .congestion_threshold = 0, \
+ .max_write = 4096, \
+ .time_gran = 1000, \
+ .max_pages = 12, \
+ .map_alignment = 4096, \
+ })); \
+ } while (false)
+
+#define TESTFUSEINIT() \
+ TESTFUSEINITFLAGS(0)
+
+#define DECL_FUSE_IN(name) \
+ struct fuse_##name##_in *name##_in = \
+ (struct fuse_##name##_in *) \
+ (bytes_in + sizeof(struct fuse_in_header))
+
+#define DECL_FUSE(name) \
+ struct fuse_##name##_in *name##_in __attribute__((unused)); \
+ struct fuse_##name##_out *name##_out __attribute__((unused))
+
+#define FUSE_ACTION TEST(pid = fork(), pid != -1); \
+ if (pid) {
+
+#define FUSE_DAEMON } else { \
+ uint8_t bytes_in[FUSE_MIN_READ_BUFFER] \
+ __attribute__((unused)); \
+ uint8_t bytes_out[FUSE_MIN_READ_BUFFER] \
+ __attribute__((unused));
+
+#define FUSE_DONE exit(TEST_SUCCESS); \
+ } \
+ TESTEQUAL(waitpid(pid, &status, 0), pid); \
+ TESTEQUAL(status, TEST_SUCCESS);
+
+int mount_fuse(const char *mount_dir, const char *bpf_name, int dir_fd,
+ int *fuse_dev_ptr);
+int mount_fuse_no_init(const char *mount_dir, const char *bpf_name, int dir_fd,
+ int *fuse_dev_ptr);
+#endif
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 02:01:23

by Daniel Rosenberg

[permalink] [raw]
Subject: [RFC PATCH v3 37/37] fuse: Provide easy way to test fuse struct_op call

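This is useful for quickly testing a struct_op program: write the name
of a registered fuse_ops to the struct_op attribute in the fuse sysfs
directory, then read the attribute back to invoke its mkdir_prefilter
on a fixed test input.
I've been using this setup to test verifier changes.

I'll eventually move those sorts of tests to the bpf selftests.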

Signed-off-by: Daniel Rosenberg <[email protected]>
---
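A rough userspace sketch of the write-then-read flow described in the
commit message, for reference only. The helper names and the
STRUCT_OP_ATTR path are assumptions made for illustration (the path is
inferred from the "fuse" kobject under fs_kobj and the struct_op
attribute name); nothing like this is part of the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed attribute location; not spelled out anywhere in the patch. */
#define STRUCT_OP_ATTR "/sys/fs/fuse/struct_op"

/* Hypothetical helper: write a fuse_ops name to the attribute at path.
 * Returns 0 on success, -1 on any error or short write. */
static int select_struct_op(const char *path, const char *name)
{
	size_t len;
	ssize_t written;
	int fd;

	assert(path && name);
	len = strlen(name);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	written = write(fd, name, len);
	close(fd);
	return written == (ssize_t)len ? 0 : -1;
}

/* Hypothetical helper: read the attribute, which is what triggers the
 * mkdir_prefilter call in struct_op_show(). The result is copied into
 * out and NUL-terminated. Returns the byte count, or -1 on error. */
static ssize_t run_struct_op(const char *path, char *out, size_t out_len)
{
	ssize_t n;
	int fd;

	assert(path && out && out_len > 0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, out, out_len - 1);
	close(fd);
	if (n < 0)
		return -1;
	out[n] = '\0';
	return n;
}
```

With the test program loaded, calling select_struct_op(STRUCT_OP_ATTR,
"test") followed by run_struct_op(STRUCT_OP_ATTR, buf, sizeof(buf))
should leave a line in the "%d dyn:%s" format in buf, assuming the
attribute really lives at that path.
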
fs/fuse/inode.c | 70 ++
.../selftests/filesystems/fuse/Makefile | 1 +
.../filesystems/fuse/struct_op_test.bpf.c | 642 ++++++++++++++++++
3 files changed, 713 insertions(+)
create mode 100644 tools/testing/selftests/filesystems/fuse/struct_op_test.bpf.c

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 7fd79efbdac1..d80c7282c91c 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -2071,16 +2071,83 @@ static void fuse_fs_cleanup(void)

static struct kobject *fuse_kobj;

+static char struct_op_name[BPF_FUSE_NAME_MAX];
+static struct fuse_ops *fop;
+
+static ssize_t struct_op_store(struct kobject *kobj,
+ struct kobj_attribute *attr,
+ const char *buf, size_t count)
+{
+ size_t max = min_t(size_t, count, BPF_FUSE_NAME_MAX - 1);
+
+ memset(struct_op_name, 0, BPF_FUSE_NAME_MAX);
+ memcpy(struct_op_name, buf, max);
+ if (max && struct_op_name[max - 1] == '\n')
+ struct_op_name[max - 1] = 0;
+ put_fuse_ops(fop);
+ fop = find_fuse_ops(struct_op_name);
+ if (!fop)
+ pr_warn("No struct op named %s found\n", struct_op_name);
+
+ return count;
+}
+
+static ssize_t struct_op_show(struct kobject *kobj,
+ struct kobj_attribute *attr, char *buf)
+{
+ struct fuse_ops *op;
+ uint32_t result = 0;
+ struct bpf_fuse_meta_info meta = {};
+ struct fuse_mkdir_in in = {};
+ struct fuse_buffer name;
+ char name_buff[10] = "test";
+
+ name.data = &name_buff[0];
+ name.flags = BPF_FUSE_VARIABLE_SIZE;
+ name.max_size = 10;
+ name.size = 5;
+
+ op = fop;
+ if (!op) {
+ pr_warn("Could not find fuse_op for %s\n", struct_op_name);
+ return 0;
+ }
+
+ if (op->mkdir_prefilter)
+ result = op->mkdir_prefilter(&meta, &in, &name);
+ else
+ pr_info("No func!!\n");
+
+ pr_info("in->mode:%d, name:%s result:%d\n", in.mode, (char *)name.data, result);
+ return sysfs_emit(buf, "%d dyn:%s\n", result, (char *)name.data);
+}
+
+static struct kobj_attribute test_attr = __ATTR_RW(struct_op);
+
+static struct attribute *test_attrs[] = {
+ &test_attr.attr,
+ NULL,
+};
+
+static const struct attribute_group test_attr_group = {
+ .attrs = test_attrs,
+};
+
static int fuse_sysfs_init(void)
{
int err;

+ memset(struct_op_name, 0, BPF_FUSE_NAME_MAX);
fuse_kobj = kobject_create_and_add("fuse", fs_kobj);
if (!fuse_kobj) {
err = -ENOMEM;
goto out_err;
}

+ err = sysfs_create_group(fuse_kobj, &test_attr_group);
+ if (err)
+ goto tmp;
+
err = sysfs_create_mount_point(fuse_kobj, "connections");
if (err)
goto out_fuse_unregister;
@@ -2089,6 +2156,8 @@ static int fuse_sysfs_init(void)

out_fuse_unregister:
+ sysfs_remove_group(fuse_kobj, &test_attr_group);
+tmp:
kobject_put(fuse_kobj);
out_err:
return err;
}
@@ -2096,6 +2165,7 @@ static int fuse_sysfs_init(void)
static void fuse_sysfs_cleanup(void)
{
sysfs_remove_mount_point(fuse_kobj, "connections");
+ sysfs_remove_group(fuse_kobj, &test_attr_group);
kobject_put(fuse_kobj);
}

diff --git a/tools/testing/selftests/filesystems/fuse/Makefile b/tools/testing/selftests/filesystems/fuse/Makefile
index b2df4dec0651..ff28859f3268 100644
--- a/tools/testing/selftests/filesystems/fuse/Makefile
+++ b/tools/testing/selftests/filesystems/fuse/Makefile
@@ -52,6 +52,7 @@ SELFTESTS:=$(TOOLSDIR)/testing/selftests/
LDLIBS := -lpthread -lelf -lz
TEST_GEN_PROGS := fuse_test fuse_daemon
TEST_GEN_FILES := \
+ struct_op_test.bpf.o \
test.skel.h \
fd.sh \

diff --git a/tools/testing/selftests/filesystems/fuse/struct_op_test.bpf.c b/tools/testing/selftests/filesystems/fuse/struct_op_test.bpf.c
new file mode 100644
index 000000000000..2cb178d2fa0c
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/struct_op_test.bpf.c
@@ -0,0 +1,642 @@
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+// Copyright (c) 2021 Google LLC
+
+#include "vmlinux.h"
+//#include <uapi/linux/bpf.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+//#include <linux/fuse.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+#define BPF_STRUCT_OPS(type, name, args...) \
+SEC("struct_ops/"#name) \
+type BPF_PROG(name, ##args)
+
+/*
+struct test_struct {
+ uint32_t a;
+ uint32_t b;
+};
+
+
+*/
+//struct fuse_buffer;
+#define BPF_FUSE_CONTINUE 0
+/*struct fuse_ops {
+ uint32_t (*test_func)(void);
+ uint32_t (*test_func2)(struct test_struct *a);
+ uint32_t (*test_func3)(struct fuse_name *ptr);
+ //u32 (*open_prefilter)(struct bpf_fuse_hidden_info meh, struct bpf_fuse_meta_info header, struct fuse_open_in foi);
+ //u32 (*open_postfilter)(struct bpf_fuse_hidden_info meh, struct bpf_fuse_meta_info header, const struct fuse_open_in foi, struct fuse_open_out foo);
+ char name[BPF_FUSE_NAME_MAX];
+};
+*/
+extern uint32_t bpf_fuse_return_len(struct fuse_buffer *ptr) __ksym;
+extern void bpf_fuse_get_rw_dynptr(struct fuse_buffer *buffer, struct bpf_dynptr *dynptr, u64 size, bool copy) __ksym;
+extern void bpf_fuse_get_ro_dynptr(const struct fuse_buffer *buffer, struct bpf_dynptr *dynptr) __ksym;
+
+//extern struct bpf_key *bpf_lookup_user_key(__u32 serial, __u64 flags) __ksym;
+//extern struct bpf_key *bpf_lookup_system_key(__u64 id) __ksym;
+//extern void bpf_key_put(struct bpf_key *key) __ksym;
+//extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_ptr,
+// struct bpf_dynptr *sig_ptr,
+// struct bpf_key *trusted_keyring) __ksym;
+
+BPF_STRUCT_OPS(uint32_t, test_func, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_in *in, struct fuse_buffer *name)
+{
+ int res = 0;
+ struct bpf_dynptr name_ptr;
+ char *name_buf;
+ //char dummy[7] = {};
+
+ bpf_fuse_get_ro_dynptr(name, &name_ptr);
+ name_buf = bpf_dynptr_slice(&name_ptr, 0, NULL, 4);
+ bpf_printk("Hello test print");
+ if (!name_buf)
+ return -ENOMEM;
+ if (!bpf_strncmp(name_buf, 4, "test"))
+ return 42;
+
+ //if (bpf_fuse_namecmp(name, "test", 4) == 0)
+ // return 42;
+
+ return res;
+}
+
+SEC(".struct_ops")
+struct fuse_ops test_ops = {
+ .mkdir_prefilter = (void *)test_func,
+ .name = "test",
+};
+
+BPF_STRUCT_OPS(uint32_t, open_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ bpf_printk("open_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, open_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_open_in *in,
+ struct fuse_open_out *out)
+{
+ bpf_printk("open_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, opendir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_open_in *in)
+{
+ bpf_printk("opendir_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, opendir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_open_in *in,
+ struct fuse_open_out *out)
+{
+ bpf_printk("opendir_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, create_open_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_create_in *in, struct fuse_buffer *name)
+{
+ bpf_printk("create_open_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, create_open_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_create_in *in, const struct fuse_buffer *name,
+ struct fuse_entry_out *entry_out, struct fuse_open_out *out)
+{
+ bpf_printk("create_open_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, release_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in)
+{
+ bpf_printk("release_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, release_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_release_in *in)
+{
+ bpf_printk("release_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, releasedir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_release_in *in)
+{
+ bpf_printk("releasedir_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, releasedir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_release_in *in)
+{
+ bpf_printk("releasedir_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, flush_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_flush_in *in)
+{
+ bpf_printk("flush_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, flush_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_flush_in *in)
+{
+ bpf_printk("flush_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, lseek_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_lseek_in *in)
+{
+ bpf_printk("lseek_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, lseek_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_lseek_in *in,
+ struct fuse_lseek_out *out)
+{
+ bpf_printk("lseek_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, copy_file_range_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_copy_file_range_in *in)
+{
+ bpf_printk("copy_file_range_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, copy_file_range_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_copy_file_range_in *in,
+ struct fuse_write_out *out)
+{
+ bpf_printk("copy_file_range_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, fsync_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in)
+{
+ bpf_printk("fsync_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, fsync_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_fsync_in *in)
+{
+ bpf_printk("fsync_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, dir_fsync_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_fsync_in *in)
+{
+ bpf_printk("dir_fsync_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, dir_fsync_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_fsync_in *in)
+{
+ bpf_printk("dir_fsync_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, getxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in, struct fuse_buffer *name)
+{
+ bpf_printk("getxattr_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, getxattr_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_getxattr_in *in, const struct fuse_buffer *name,
+ struct fuse_buffer *value, struct fuse_getxattr_out *out)
+{
+ bpf_printk("getxattr_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, listxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getxattr_in *in)
+{
+ bpf_printk("listxattr_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, listxattr_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_getxattr_in *in,
+ struct fuse_buffer *value, struct fuse_getxattr_out *out)
+{
+ bpf_printk("listxattr_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, setxattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_setxattr_in *in, struct fuse_buffer *name,
+ struct fuse_buffer *value)
+{
+ bpf_printk("setxattr_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, setxattr_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_setxattr_in *in, const struct fuse_buffer *name,
+ const struct fuse_buffer *value)
+{
+ bpf_printk("setxattr_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, removexattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ bpf_printk("removexattr_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, removexattr_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name)
+{
+ bpf_printk("removexattr_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, read_iter_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("read_iter_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, read_iter_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_read_in *in,
+ struct fuse_read_iter_out *out)
+{
+ bpf_printk("read_iter_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, write_iter_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_write_in *in)
+{
+ bpf_printk("write_iter_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, write_iter_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_write_in *in,
+ struct fuse_write_iter_out *out)
+{
+ bpf_printk("write_iter_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, file_fallocate_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_fallocate_in *in)
+{
+ bpf_printk("file_fallocate_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, file_fallocate_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_fallocate_in *in)
+{
+ bpf_printk("file_fallocate_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, lookup_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ bpf_printk("lookup_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, lookup_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name,
+ struct fuse_entry_out *out, struct fuse_buffer *entries)
+{
+ bpf_printk("lookup_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, mknod_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mknod_in *in, struct fuse_buffer *name)
+{
+ bpf_printk("mknod_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, mknod_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_mknod_in *in, const struct fuse_buffer *name)
+{
+ bpf_printk("mknod_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, mkdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_mkdir_in *in, struct fuse_buffer *name)
+{
+ bpf_printk("mkdir_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, mkdir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_mkdir_in *in, const struct fuse_buffer *name)
+{
+ bpf_printk("mkdir_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, rmdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ bpf_printk("rmdir_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, rmdir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name)
+{
+ bpf_printk("rmdir_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, rename2_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename2_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name)
+{
+ bpf_printk("rename2_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, rename2_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_rename2_in *in, const struct fuse_buffer *old_name,
+ const struct fuse_buffer *new_name)
+{
+ bpf_printk("rename2_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, rename_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_rename_in *in, struct fuse_buffer *old_name,
+ struct fuse_buffer *new_name)
+{
+ bpf_printk("rename_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, rename_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_rename_in *in, const struct fuse_buffer *old_name,
+ const struct fuse_buffer *new_name)
+{
+ bpf_printk("rename_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, unlink_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ bpf_printk("unlink_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, unlink_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name)
+{
+ bpf_printk("unlink_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, link_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_link_in *in, struct fuse_buffer *name)
+{
+ bpf_printk("link_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, link_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_link_in *in, const struct fuse_buffer *name)
+{
+ bpf_printk("link_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, getattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_getattr_in *in)
+{
+ bpf_printk("getattr_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, getattr_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_getattr_in *in,
+ struct fuse_attr_out *out)
+{
+ bpf_printk("getattr_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, setattr_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_setattr_in *in)
+{
+ bpf_printk("setattr_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, setattr_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_setattr_in *in,
+ struct fuse_attr_out *out)
+{
+ bpf_printk("setattr_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, statfs_prefilter, const struct bpf_fuse_meta_info *meta)
+{
+ bpf_printk("statfs_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, statfs_postfilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_statfs_out *out)
+{
+ bpf_printk("statfs_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, get_link_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name)
+{
+ bpf_printk("get_link_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, get_link_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name)
+{
+ bpf_printk("get_link_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, symlink_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_buffer *name, struct fuse_buffer *path)
+{
+ bpf_printk("symlink_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, symlink_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_buffer *name, const struct fuse_buffer *path)
+{
+ bpf_printk("symlink_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, readdir_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_read_in *in)
+{
+ bpf_printk("readdir_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, readdir_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_read_in *in,
+ struct fuse_read_out *out, struct fuse_buffer *buffer)
+{
+ bpf_printk("readdir_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, access_prefilter, const struct bpf_fuse_meta_info *meta,
+ struct fuse_access_in *in)
+{
+ bpf_printk("access_prefilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+BPF_STRUCT_OPS(uint32_t, access_postfilter, const struct bpf_fuse_meta_info *meta,
+ const struct fuse_access_in *in)
+{
+ bpf_printk("access_postfilter");
+ return BPF_FUSE_CONTINUE;
+}
+
+SEC(".struct_ops")
+struct fuse_ops trace_ops = {
+ .open_prefilter = (void *)open_prefilter,
+ .open_postfilter = (void *)open_postfilter,
+
+ .opendir_prefilter = (void *)opendir_prefilter,
+ .opendir_postfilter = (void *)opendir_postfilter,
+
+ .create_open_prefilter = (void *)create_open_prefilter,
+ .create_open_postfilter = (void *)create_open_postfilter,
+
+ .release_prefilter = (void *)release_prefilter,
+ .release_postfilter = (void *)release_postfilter,
+
+ .releasedir_prefilter = (void *)releasedir_prefilter,
+ .releasedir_postfilter = (void *)releasedir_postfilter,
+
+ .flush_prefilter = (void *)flush_prefilter,
+ .flush_postfilter = (void *)flush_postfilter,
+
+ .lseek_prefilter = (void *)lseek_prefilter,
+ .lseek_postfilter = (void *)lseek_postfilter,
+
+ .copy_file_range_prefilter = (void *)copy_file_range_prefilter,
+ .copy_file_range_postfilter = (void *)copy_file_range_postfilter,
+
+ .fsync_prefilter = (void *)fsync_prefilter,
+ .fsync_postfilter = (void *)fsync_postfilter,
+
+ .dir_fsync_prefilter = (void *)dir_fsync_prefilter,
+ .dir_fsync_postfilter = (void *)dir_fsync_postfilter,
+
+ .getxattr_prefilter = (void *)getxattr_prefilter,
+ .getxattr_postfilter = (void *)getxattr_postfilter,
+
+ .listxattr_prefilter = (void *)listxattr_prefilter,
+ .listxattr_postfilter = (void *)listxattr_postfilter,
+
+ .setxattr_prefilter = (void *)setxattr_prefilter,
+ .setxattr_postfilter = (void *)setxattr_postfilter,
+
+ .removexattr_prefilter = (void *)removexattr_prefilter,
+ .removexattr_postfilter = (void *)removexattr_postfilter,
+
+ .read_iter_prefilter = (void *)read_iter_prefilter,
+ .read_iter_postfilter = (void *)read_iter_postfilter,
+
+ .write_iter_prefilter = (void *)write_iter_prefilter,
+ .write_iter_postfilter = (void *)write_iter_postfilter,
+
+ .file_fallocate_prefilter = (void *)file_fallocate_prefilter,
+ .file_fallocate_postfilter = (void *)file_fallocate_postfilter,
+
+ .lookup_prefilter = (void *)lookup_prefilter,
+ .lookup_postfilter = (void *)lookup_postfilter,
+
+ .mknod_prefilter = (void *)mknod_prefilter,
+ .mknod_postfilter = (void *)mknod_postfilter,
+
+ .mkdir_prefilter = (void *)mkdir_prefilter,
+ .mkdir_postfilter = (void *)mkdir_postfilter,
+
+ .rmdir_prefilter = (void *)rmdir_prefilter,
+ .rmdir_postfilter = (void *)rmdir_postfilter,
+
+ .rename2_prefilter = (void *)rename2_prefilter,
+ .rename2_postfilter = (void *)rename2_postfilter,
+
+ .rename_prefilter = (void *)rename_prefilter,
+ .rename_postfilter = (void *)rename_postfilter,
+
+ .unlink_prefilter = (void *)unlink_prefilter,
+ .unlink_postfilter = (void *)unlink_postfilter,
+
+ .link_prefilter = (void *)link_prefilter,
+ .link_postfilter = (void *)link_postfilter,
+
+ .getattr_prefilter = (void *)getattr_prefilter,
+ .getattr_postfilter = (void *)getattr_postfilter,
+
+ .setattr_prefilter = (void *)setattr_prefilter,
+ .setattr_postfilter = (void *)setattr_postfilter,
+
+ .statfs_prefilter = (void *)statfs_prefilter,
+ .statfs_postfilter = (void *)statfs_postfilter,
+
+ .get_link_prefilter = (void *)get_link_prefilter,
+ .get_link_postfilter = (void *)get_link_postfilter,
+
+ .symlink_prefilter = (void *)symlink_prefilter,
+ .symlink_postfilter = (void *)symlink_postfilter,
+
+ .readdir_prefilter = (void *)readdir_prefilter,
+ .readdir_postfilter = (void *)readdir_postfilter,
+
+ .access_prefilter = (void *)access_prefilter,
+ .access_postfilter = (void *)access_postfilter,
+
+ .name = "trace_pre_ops",
+};
--
2.40.0.634.g4ca3ef3211-goog

2023-04-18 05:34:31

by Amir Goldstein

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

On Tue, Apr 18, 2023 at 4:40 AM Daniel Rosenberg <[email protected]> wrote:
>
> These patches extend FUSE to be able to act as a stacked filesystem. This
> allows pure passthrough, where the fuse file system simply reflects the lower
> filesystem, and also allows optional pre and post filtering in BPF and/or the
> userspace daemon as needed. This can dramatically reduce or even eliminate
> transitions to and from userspace.
>
> In this patch set, I've reworked the bpf code to add a new struct_op type
> instead of a new program type, and used new kfuncs in place of new helpers.
> Additionally, it now uses dynptrs for variable sized buffers. The first three
> patches are repeats of a previous patch set which I have not yet adjusted for
> comments. I plan to adjust those and submit them separately with fixes, but
> wanted to have the current fuse-bpf code visible before then.
>
> Patches 4-7 mostly rearrange existing code to remove noise from the main patch.
> Patch 8 contains the main sections of fuse-bpf
> Patches 9-25 implementing most FUSE functions as operations on a lower
> filesystem. From patch 25, you can run fuse as a passthrough filesystem.
> Patches 26-32 provide bpf functionality so that you can alter fuse parameters
> via fuse_op programs.
> Patch 33 extends this to userspace, and patches 34-37 add some testing
> functionality.
>

That's a nice logical breakup for review.

I feel there is so much subtle code in those patches that the
only sane path forward is to review and merge them in phases.

Your patches add this config:

+config FUSE_BPF
+ bool "Adds BPF to fuse"
+ depends on FUSE_FS
+ depends on BPF
+ help
+ Extends FUSE by adding BPF to prefilter calls and potentially pass to a
+ backing file system

Since your patches add the PASSTHROUGH functionality before adding
BPF functionality, would it make sense to review and merge the PASSTHROUGH
functionality strictly before the BPF functionality?

Alternatively, you could aim to merge support for some PASSTHROUGH ops
then support for some BPF functionality and then slowly add ops to both.

Which brings me to my biggest concern.
I still do not see how these patches replace Allesio's
FUSE_DEV_IOC_PASSTHROUGH_OPEN patches.

Is the idea here that ioctl needs to be done at FUSE_LOOKUP
instead or in addition to the ioctl on FUSE_OPEN to setup the
read/write passthrough on the backing file?

I am missing things like the FILESYSTEM_MAX_STACK_DEPTH check that
was added as a result of review on Allesio's patches.
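
For readers unfamiliar with that check, here is a minimal userspace sketch of
the idea, not the code from either patch set. Stacking filesystems record how
many layers sit below them and refuse to attach a backing filesystem that
would exceed the limit. All sketch_* names are hypothetical, and the limit of
2 is my assumption about the value of FILESYSTEM_MAX_STACK_DEPTH.

```c
#include <errno.h>

/* Assumption: mirrors the kernel's FILESYSTEM_MAX_STACK_DEPTH. */
#define SKETCH_MAX_STACK_DEPTH 2

/* Hypothetical stand-in for the stack depth kept in a superblock. */
struct sketch_sb {
	int s_stack_depth;
};

/*
 * Called when a backing filesystem is attached to a stacking one.
 * Returns 0 and records the new depth, or -EINVAL if the resulting
 * stack would be deeper than the limit.
 */
static int sketch_set_backing(struct sketch_sb *upper,
			      const struct sketch_sb *backing)
{
	int depth = backing->s_stack_depth + 1;

	if (depth > SKETCH_MAX_STACK_DEPTH)
		return -EINVAL;
	if (depth > upper->s_stack_depth)
		upper->s_stack_depth = depth;
	return 0;
}
```

With this model, stacking on a plain filesystem gives depth 1, stacking on
that gives depth 2, and a third layer is rejected.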

The reason I am concerned about this is that we are using the
FUSE_DEV_IOC_PASSTHROUGH_OPEN patches and I would like
to upstream their functionality sooner rather than later.
These patches have already been running in production for a while.
I believe that they are running in Android as well, and there is value
in upstreaming well-tested patches.

The API does not need to stay FUSE_DEV_IOC_PASSTHROUGH_OPEN
it should be an API that is extendable to FUSE-BPF, but it would be
useful if the read/write passthrough could be the goal for first merge.

Does any of this make sense to you?
Can you draw a roadmap for merging FUSE-BPF that starts with
a first (hopefully short term) phase that adds the read/write passthrough
functionality?

I can help with review and testing of that part if needed.
I was planning to discuss this with you on LSFMM anyway,
but better start the discussion beforehand.

Thanks,
Amir.

2023-04-21 01:46:09

by Daniel Rosenberg

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

On Mon, Apr 17, 2023 at 10:33 PM Amir Goldstein <[email protected]> wrote:
>
>
> Which brings me to my biggest concern.
> I still do not see how these patches replace Allesio's
> FUSE_DEV_IOC_PASSTHROUGH_OPEN patches.
>
> Is the idea here that ioctl needs to be done at FUSE_LOOKUP
> instead or in addition to the ioctl on FUSE_OPEN to setup the
> read/write passthrough on the backing file?
>

In these patches, the fuse daemon responds to the lookup request via
an ioctl, essentially in the same way it would have to the /dev/fuse
node. It just flags the write as coming from an ioctl and calls
fuse_dev_do_write. An additional block in the lookup response gives
the backing file and what bpf_ops to use. The main difference is that
fuse-bpf uses backing inodes, while passthrough uses a file.
Fuse-bpf's read/write support currently isn't complete, but it does
allow for direct passthrough. You could set ops to default to
userspace in every case that Allesio's passthrough code does and it
should have about the same effect. With the struct_op change, I did
notice that doing something like that is more annoying, and am
planning to add a default op which only takes the meta info and runs
if the opcode specific op is not present.
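
To make the default-op idea concrete, here is a rough userspace sketch of the
dispatch order I have in mind. It is not code from the patches: the sketch_*
names, the simplified argument types and the SKETCH_USER return value are all
made up for illustration. Only the fallback order is the point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes, standing in for the BPF_FUSE_* values. */
enum { SKETCH_CONTINUE = 0, SKETCH_USER = 1 };

/* Hypothetical, trimmed-down stand-in for struct bpf_fuse_meta_info. */
struct sketch_meta {
	uint64_t nodeid;
	uint32_t opcode;
};

struct sketch_ops {
	/* Opcode-specific op; may be NULL. */
	uint32_t (*open_prefilter)(const struct sketch_meta *meta,
				   uint32_t *flags);
	/* Fallback op that only sees the meta info; may be NULL. */
	uint32_t (*default_prefilter)(const struct sketch_meta *meta);
};

/* Prefer the specific op, then the default op, else go to backing. */
static uint32_t sketch_run_open_prefilter(const struct sketch_ops *ops,
					  const struct sketch_meta *meta,
					  uint32_t *flags)
{
	if (ops->open_prefilter)
		return ops->open_prefilter(meta, flags);
	if (ops->default_prefilter)
		return ops->default_prefilter(meta);
	return SKETCH_CONTINUE;
}

/* Example default op: send every unhandled opcode to userspace. */
static uint32_t sketch_send_to_userspace(const struct sketch_meta *meta)
{
	(void)meta;
	return SKETCH_USER;
}
```

A daemon that wants passthrough for only a handful of opcodes would install
sketch_send_to_userspace as the default and provide specific ops for the
rest, instead of writing a stub for every opcode.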


> I am missing things like the FILESYSTEM_MAX_STACK_DEPTH check that
> was added as a result of review on Allesio's patches.
>

I'd definitely want to fix any issues that were fixed there. There's a
lot of common code between fuse-bpf and fuse passthrough, so many of
the suggestions there will apply here.

> The reason I am concerned about this is that we are using the
> FUSE_DEV_IOC_PASSTHROUGH_OPEN patches and I would like
> to upstream their functionality sooner rather than later.
> These patches have already been running in production for a while.
> I believe that they are running in Android as well, and there is value
> in upstreaming well-tested patches.
>
> The API does not need to stay FUSE_DEV_IOC_PASSTHROUGH_OPEN
> it should be an API that is extendable to FUSE-BPF, but it would be
> useful if the read/write passthrough could be the goal for first merge.
>
> Does any of this make sense to you?
> Can you draw a roadmap for merging FUSE-BPF that starts with
> a first (hopefully short term) phase that adds the read/write passthrough
> functionality?
>
> I can help with review and testing of that part if needed.
> I was planning to discuss this with you on LSFMM anyway,
> but better start the discussion beforehand.
>
> Thanks,
> Amir.

We've been using an earlier version of fuse-bpf on Android, closer to
the V1 patches. They fit our current needs but don't cover everything
we intend to. The V3 patches switch to a new style of bpf program,
which I'm hoping to get some feedback on before I spend too much time
fixing up the details. The backing calls themselves can be reviewed
separately from that though.

Without bpf, we're essentially enabling complete passthrough at a
directory or file. Once you set a backing file, fuse-bpf calls the
backing filesystem by default, with no additional userspace
interaction unless an installed bpf program says otherwise. If we had
some commands without others, we'd have behavior
changes as we introduce support for additional calls. We'd need a way
to set default behavior. Perhaps something like a u64 flag field
extension in FUSE_INIT for indicating which opcodes support backing,
and a response for what those should default to doing. If there's a
bpf_op present for a given opcode, it would be able to override that
default. If we had something like that, we'd be able to add support
for a subset of opcodes in a sensible way.
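
As a rough illustration of that negotiation, here is a userspace sketch. It
assumes one bit per opcode in a u64, that the kernel advertises what it
supports, and that the daemon replies with the defaults it wants. None of the
names or bit assignments below exist in FUSE today; they are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, not real FUSE opcode numbers. */
enum sketch_op {
	SKETCH_OP_LOOKUP,
	SKETCH_OP_OPEN,
	SKETCH_OP_READ,
	SKETCH_OP_WRITE,
	SKETCH_OP_GETATTR,
};

#define SKETCH_BIT(op) (UINT64_C(1) << (op))

/*
 * The kernel advertises which opcodes support backing, and the daemon
 * replies with the ones it wants to default to backing. Only opcodes
 * set on both sides default to the backing filesystem.
 */
static uint64_t sketch_negotiate(uint64_t kernel_supported,
				 uint64_t daemon_requested)
{
	return kernel_supported & daemon_requested;
}

/*
 * Decide where one call goes. A bpf op, if present, overrides the
 * negotiated default; pass NULL when no bpf op is installed.
 */
static bool sketch_use_backing(uint64_t negotiated, enum sketch_op op,
			       const bool *bpf_override)
{
	if (bpf_override)
		return *bpf_override;
	return (negotiated & SKETCH_BIT(op)) != 0;
}
```

An older daemon that does not know about a newly supported opcode never sets
its bit, so that opcode keeps going to userspace and behavior does not change
as kernel support grows.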

2023-04-23 14:52:40

by Amir Goldstein

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

On Fri, Apr 21, 2023 at 4:41 AM Daniel Rosenberg <[email protected]> wrote:
>
> On Mon, Apr 17, 2023 at 10:33 PM Amir Goldstein <[email protected]> wrote:
> >
> >
> > Which brings me to my biggest concern.
> > I still do not see how these patches replace Allesio's
> > FUSE_DEV_IOC_PASSTHROUGH_OPEN patches.
> >
> > Is the idea here that ioctl needs to be done at FUSE_LOOKUP
> > instead or in addition to the ioctl on FUSE_OPEN to setup the
> > read/write passthrough on the backing file?
> >
>
> In these patches, the fuse daemon responds to the lookup request via
> an ioctl, essentially in the same way it would have to the /dev/fuse
> node. It just flags the write as coming from an ioctl and calls
> fuse_dev_do_write. An additional block in the lookup response gives
> the backing file and what bpf_ops to use. The main difference is that
> fuse-bpf uses backing inodes, while passthrough uses a file.

Ah right. I wonder if there is benefit in both APIs or if a backing inode
is sufficient to implement everything that could be interesting to implement
with a backing file.

> Fuse-bpf's read/write support currently isn't complete, but it does
> allow for direct passthrough. You could set ops to default to
> userspace in every case that Allesio's passthrough code does and it
> should have about the same effect.

What are the subtle differences then?

> With the struct_op change, I did
> notice that doing something like that is more annoying, and am
> planning to add a default op which only takes the meta info and runs
> if the opcode specific op is not present.
>

Sounds interesting. I'll wait to see what you propose.

>
> > I am missing things like the FILESYSTEM_MAX_STACK_DEPTH check that
> > was added as a result of review on Allesio's patches.
> >
>
> I'd definitely want to fix any issues that were fixed there. There's a
> lot of common code between fuse-bpf and fuse passthrough, so many of
> the suggestions there will apply here.
>

That's why I suggested trying to implement the passthrough file ioctl
functionality first to make sure that none of the review comments
in the first round were missed.

But if we need the functionality of both ioctls, we can collaborate on the
work of merging them separately.

> > The reason I am concerned about this is that we are using the
> > FUSE_DEV_IOC_PASSTHROUGH_OPEN patches and I would like
> > to upstream their functionality sooner rather than later.
> > These patches have already been running in production for a while.
> > I believe that they are running in Android as well, and there is value
> > in upstreaming well-tested patches.
> >
> > The API does not need to stay FUSE_DEV_IOC_PASSTHROUGH_OPEN
> > it should be an API that is extendable to FUSE-BPF, but it would be
> > useful if the read/write passthrough could be the goal for first merge.
> >
> > Does any of this make sense to you?
> > Can you draw a roadmap for merging FUSE-BPF that starts with
> > a first (hopefully short term) phase that adds the read/write passthrough
> > functionality?
> >
> > I can help with review and testing of that part if needed.
> > I was planning to discuss this with you on LSFMM anyway,
> > but better start the discussion beforehand.
> >
> > Thanks,
> > Amir.
>
> We've been using an earlier version of fuse-bpf on Android, closer to
> the V1 patches. They fit our current needs but don't cover everything
> we intend to. The V3 patches switch to a new style of bpf program,
> which I'm hoping to get some feedback on before I spend too much time
> fixing up the details. The backing calls themselves can be reviewed
> separately from that though.
>
> Without bpf, we're essentially enabling complete passthrough at a
> directory or file. Once you set a backing file, fuse-bpf calls the
> backing filesystem by default, with no additional userspace
> interaction unless an installed bpf program says otherwise. If we had
> some commands without others, we'd have behavior
> changes as we introduce support for additional calls. We'd need a way
> to set default behavior. Perhaps something like a u64 flag field
> extension in FUSE_INIT for indicating which opcodes support backing,
> and a response for what those should default to doing. If there's a
> bpf_op present for a given opcode, it would be able to override that
> default. If we had something like that, we'd be able to add support
> for a subset of opcodes in a sensible way.

So maybe this is something to consider.

Thanks,
Amir.

2023-04-24 15:35:16

by Miklos Szeredi

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

On Tue, 18 Apr 2023 at 03:40, Daniel Rosenberg <[email protected]> wrote:
>
> These patches extend FUSE to be able to act as a stacked filesystem. This
> allows pure passthrough, where the fuse file system simply reflects the lower
> filesystem, and also allows optional pre and post filtering in BPF and/or the
> userspace daemon as needed. This can dramatically reduce or even eliminate
> transitions to and from userspace.

I'll ignore BPF for now and concentrate on the passthrough aspect,
which I understand better.

The security model needs to be thought about and documented. Think
about this: the fuse server now delegates operations it would itself
perform to the passthrough code in fuse. The permissions that would
have been checked in the context of the fuse server are now checked in
the context of the task performing the operation. The server may be
able to bypass seccomp restrictions. Files that are open on the
backing filesystem are now hidden (e.g. lsof won't find these), which
allows the server to obfuscate accesses to backing files. Etc.

These are not particularly worrying if the server is privileged, but
fuse comes with the history of supporting unprivileged servers, so we
should look at supporting passthrough with unprivileged servers as
well.

My other generic comment is that you should add justification for
doing this in the first place. I guess it's mainly performance. So
how much performance can be won in real-life cases? It would also be good
to measure the contribution of individual ops to that win. Is there
another reason for this besides performance?

Thanks,
Miklos

2023-04-27 04:20:22

by Andrii Nakryiko

Subject: Re: [RFC PATCH v3 28/37] WIP: bpf: Add fuse_ops struct_op programs

On Mon, Apr 17, 2023 at 6:42 PM Daniel Rosenberg <[email protected]> wrote:
>
> This introduces a new struct_op type: fuse_ops. This program set
> provides pre and post filters to run around fuse-bpf calls that act
> directly on the lower filesystem.
>
> The inputs are either fixed structures, or struct fuse_buffer's.
>
> These programs are not permitted to make any changes to these fuse_buffers
> unless they create a dynptr wrapper using the supplied kfunc helpers.
>
> Fuse_buffers maintain additional state information that FUSE uses to
> manage memory and determine if additional set up or checks are needed.
>
> Signed-off-by: Daniel Rosenberg <[email protected]>
> ---
> include/linux/bpf_fuse.h | 189 +++++++++++++++++++++++
> kernel/bpf/Makefile | 4 +
> kernel/bpf/bpf_fuse.c | 241 ++++++++++++++++++++++++++++++
> kernel/bpf/bpf_struct_ops_types.h | 4 +
> kernel/bpf/btf.c | 1 +
> kernel/bpf/verifier.c | 9 ++
> 6 files changed, 448 insertions(+)
> create mode 100644 kernel/bpf/bpf_fuse.c
>
> diff --git a/include/linux/bpf_fuse.h b/include/linux/bpf_fuse.h
> index ce8b1b347496..780a7889aea2 100644
> --- a/include/linux/bpf_fuse.h
> +++ b/include/linux/bpf_fuse.h
> @@ -30,6 +30,8 @@ struct fuse_buffer {
> #define BPF_FUSE_MODIFIED (1 << 3) // The helper function allowed writes to the buffer
> #define BPF_FUSE_ALLOCATED (1 << 4) // The helper function allocated the buffer
>
> +extern void *bpf_fuse_get_writeable(struct fuse_buffer *arg, u64 size, bool copy);
> +
> /*
> * BPF Fuse Args
> *
> @@ -81,4 +83,191 @@ static inline unsigned bpf_fuse_arg_size(const struct bpf_fuse_arg *arg)
> return arg->is_buffer ? arg->buffer->size : arg->size;
> }
>
> +struct fuse_ops {
> + uint32_t (*open_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_open_in *in);
> + uint32_t (*open_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_open_in *in,
> + struct fuse_open_out *out);
> +
> + uint32_t (*opendir_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_open_in *in);
> + uint32_t (*opendir_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_open_in *in,
> + struct fuse_open_out *out);
> +
> + uint32_t (*create_open_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_create_in *in, struct fuse_buffer *name);
> + uint32_t (*create_open_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_create_in *in, const struct fuse_buffer *name,
> + struct fuse_entry_out *entry_out, struct fuse_open_out *out);
> +
> + uint32_t (*release_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_release_in *in);
> + uint32_t (*release_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_release_in *in);
> +
> + uint32_t (*releasedir_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_release_in *in);
> + uint32_t (*releasedir_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_release_in *in);
> +
> + uint32_t (*flush_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_flush_in *in);
> + uint32_t (*flush_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_flush_in *in);
> +
> + uint32_t (*lseek_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_lseek_in *in);
> + uint32_t (*lseek_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_lseek_in *in,
> + struct fuse_lseek_out *out);
> +
> + uint32_t (*copy_file_range_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_copy_file_range_in *in);
> + uint32_t (*copy_file_range_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_copy_file_range_in *in,
> + struct fuse_write_out *out);
> +
> + uint32_t (*fsync_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_fsync_in *in);
> + uint32_t (*fsync_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_fsync_in *in);
> +
> + uint32_t (*dir_fsync_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_fsync_in *in);
> + uint32_t (*dir_fsync_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_fsync_in *in);
> +
> + uint32_t (*getxattr_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_getxattr_in *in, struct fuse_buffer *name);
> + // if in->size > 0, use value. If in->size == 0, use out.
> + uint32_t (*getxattr_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_getxattr_in *in, const struct fuse_buffer *name,
> + struct fuse_buffer *value, struct fuse_getxattr_out *out);
> +
> + uint32_t (*listxattr_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_getxattr_in *in);
> + // if in->size > 0, use value. If in->size == 0, use out.
> + uint32_t (*listxattr_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_getxattr_in *in,
> + struct fuse_buffer *value, struct fuse_getxattr_out *out);
> +
> + uint32_t (*setxattr_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_setxattr_in *in, struct fuse_buffer *name,
> + struct fuse_buffer *value);
> + uint32_t (*setxattr_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_setxattr_in *in, const struct fuse_buffer *name,
> + const struct fuse_buffer *value);
> +
> + uint32_t (*removexattr_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_buffer *name);
> + uint32_t (*removexattr_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_buffer *name);
> +
> + /* Read and Write iter will likely undergo some sort of change/addition to handle changing
> + * the data buffer passed in/out. */
> + uint32_t (*read_iter_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_read_in *in);
> + uint32_t (*read_iter_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_read_in *in,
> + struct fuse_read_iter_out *out);
> +
> + uint32_t (*write_iter_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_write_in *in);
> + uint32_t (*write_iter_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_write_in *in,
> + struct fuse_write_iter_out *out);
> +
> + uint32_t (*file_fallocate_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_fallocate_in *in);
> + uint32_t (*file_fallocate_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_fallocate_in *in);
> +
> + uint32_t (*lookup_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_buffer *name);
> + uint32_t (*lookup_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_buffer *name,
> + struct fuse_entry_out *out, struct fuse_buffer *entries);
> +
> + uint32_t (*mknod_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_mknod_in *in, struct fuse_buffer *name);
> + uint32_t (*mknod_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_mknod_in *in, const struct fuse_buffer *name);
> +
> + uint32_t (*mkdir_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_mkdir_in *in, struct fuse_buffer *name);
> + uint32_t (*mkdir_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_mkdir_in *in, const struct fuse_buffer *name);
> +
> + uint32_t (*rmdir_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_buffer *name);
> + uint32_t (*rmdir_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_buffer *name);
> +
> + uint32_t (*rename2_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_rename2_in *in, struct fuse_buffer *old_name,
> + struct fuse_buffer *new_name);
> + uint32_t (*rename2_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_rename2_in *in, const struct fuse_buffer *old_name,
> + const struct fuse_buffer *new_name);
> +
> + uint32_t (*rename_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_rename_in *in, struct fuse_buffer *old_name,
> + struct fuse_buffer *new_name);
> + uint32_t (*rename_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_rename_in *in, const struct fuse_buffer *old_name,
> + const struct fuse_buffer *new_name);
> +
> + uint32_t (*unlink_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_buffer *name);
> + uint32_t (*unlink_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_buffer *name);
> +
> + uint32_t (*link_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_link_in *in, struct fuse_buffer *name);
> + uint32_t (*link_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_link_in *in, const struct fuse_buffer *name);
> +
> + uint32_t (*getattr_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_getattr_in *in);
> + uint32_t (*getattr_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_getattr_in *in,
> + struct fuse_attr_out *out);
> +
> + uint32_t (*setattr_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_setattr_in *in);
> + uint32_t (*setattr_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_setattr_in *in,
> + struct fuse_attr_out *out);
> +
> + uint32_t (*statfs_prefilter)(const struct bpf_fuse_meta_info *meta);
> + uint32_t (*statfs_postfilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_statfs_out *out);
> +
> + //TODO: This does not allow doing anything with path
> + uint32_t (*get_link_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_buffer *name);
> + uint32_t (*get_link_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_buffer *name);
> +
> + uint32_t (*symlink_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_buffer *name, struct fuse_buffer *path);
> + uint32_t (*symlink_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_buffer *name, const struct fuse_buffer *path);
> +
> + uint32_t (*readdir_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_read_in *in);
> + uint32_t (*readdir_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_read_in *in,
> + struct fuse_read_out *out, struct fuse_buffer *buffer);
> +
> + uint32_t (*access_prefilter)(const struct bpf_fuse_meta_info *meta,
> + struct fuse_access_in *in);
> + uint32_t (*access_postfilter)(const struct bpf_fuse_meta_info *meta,
> + const struct fuse_access_in *in);
> +
> + char name[BPF_FUSE_NAME_MAX];
> +};

Have you considered grouping this huge amount of callbacks into a
smaller set of more generic callbacks where each callback would get
enum argument specifying what sort of operation it is called for? This
has many advantages, starting from not having to deal with struct_ops
limits, ending with not needing to instantiate dozens of individual
BPF programs.

E.g., for a lot of operations the difference between pre- and
post-filter is in having in argument as read-only and maybe having
extra out argument for post-filter. One way to unify such post/pre
filters into one callback would be to record whether in has to be
read-only or read-write and not allow to create r/w dynptr for the
former case. Pass bool or enum specifying if it's post or pre filter.
For that optional out argument, you can simulate effectively the same
by always supplying it, but making sure that out parameter is
read-only and zero-sized, for example.

That would cut the number of callbacks in two, which I'd say still is
not great :) I think it would be better still to have even larger
groups of callbacks for whole families of operations with the same (or
"unifiable") interface (domain experts like you would need to do an
analysis here to see what makes sense to group, of course).

We'll probably touch on that tomorrow at BPF office hours, but I
wanted to point this out beforehand, so that you have time to think
about it.
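
To make the grouping idea concrete, here is a minimal userspace sketch of one callback covering a family of name-only operations, with the operation and the pre/post phase passed as enum arguments. The enum names, struct model_name and name_family_filter are hypothetical and do not appear in the patch set; only the two return codes are taken from it, and treating BPF_FUSE_USER as "hand off to the daemon" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return codes as defined in the patch. */
#define BPF_FUSE_CONTINUE 0
#define BPF_FUSE_USER 1

/* Everything below is hypothetical and only models the suggestion. */
enum model_phase { MODEL_PREFILTER, MODEL_POSTFILTER };
enum model_name_op { MODEL_OP_RMDIR, MODEL_OP_UNLINK };

struct model_name {
	const char *data;
	size_t size;
};

/* One callback for the whole family of ops that take only a name. */
static uint32_t name_family_filter(enum model_name_op op,
				   enum model_phase phase,
				   const struct model_name *name)
{
	assert(name != NULL);

	/*
	 * Prefilter unlinks of dotfiles return BPF_FUSE_USER, assumed
	 * here to mean "hand the request to the userspace daemon".
	 */
	if (phase == MODEL_PREFILTER && op == MODEL_OP_UNLINK &&
	    name->size > 0 && name->data[0] == '.')
		return BPF_FUSE_USER;

	return BPF_FUSE_CONTINUE;
}
```

The trade-off is visible even in this small model: the callback count shrinks, but every invocation pays for the op/phase checks, and the arguments lose their per-operation types.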

> +
> #endif /* _BPF_FUSE_H */
> diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
> index 1d3892168d32..26a2e741ef61 100644
> --- a/kernel/bpf/Makefile
> +++ b/kernel/bpf/Makefile

[...]

> +__diag_push();
> +__diag_ignore_all("-Wmissing-prototypes",
> + "Global kfuncs as their definitions will be in BTF");
> +void bpf_fuse_get_rw_dynptr(struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit, u64 size, bool copy)

not clear why size is passed from outside instead of instantiating
dynptr with buffer->size? See [0] for bpf_dynptr_adjust and
bpf_dynptr_clone that allow you to adjust buffer as necessary.

As for the copy parameter, can you elaborate on the idea behind it?

[0] https://patchwork.kernel.org/project/netdevbpf/list/?series=741584&state=*

> +{
> + buffer->data = bpf_fuse_get_writeable(buffer, size, copy);
> + bpf_dynptr_init(dynptr__uninit, buffer->data, BPF_DYNPTR_TYPE_LOCAL, 0, buffer->size);
> +}
> +
> +void bpf_fuse_get_ro_dynptr(const struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit)

these kfuncs probably should be more consistently named as
bpf_dynptr_from_fuse_buffer_{ro,rw}() ?

> +{
> + bpf_dynptr_init(dynptr__uninit, buffer->data, BPF_DYNPTR_TYPE_LOCAL, 0, buffer->size);
> + bpf_dynptr_set_rdonly(dynptr__uninit);
> +}
> +
> +uint32_t bpf_fuse_return_len(struct fuse_buffer *buffer)
> +{
> + return buffer->size;

you should be able to get this with bpf_dynptr_size() (once you create
it from fuse_buffer).

> +}
> +__diag_pop();
> +BTF_SET8_START(fuse_kfunc_set)
> +BTF_ID_FLAGS(func, bpf_fuse_get_rw_dynptr)
> +BTF_ID_FLAGS(func, bpf_fuse_get_ro_dynptr)
> +BTF_ID_FLAGS(func, bpf_fuse_return_len)
> +BTF_SET8_END(fuse_kfunc_set)
> +
> +static const struct btf_kfunc_id_set bpf_fuse_kfunc_set = {
> + .owner = THIS_MODULE,
> + .set = &fuse_kfunc_set,
> +};
> +
> +static int __init bpf_fuse_kfuncs_init(void)
> +{
> + return register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
> + &bpf_fuse_kfunc_set);
> +}
> +
> +late_initcall(bpf_fuse_kfuncs_init);
> +
> +static const struct bpf_func_proto *bpf_fuse_get_func_proto(enum bpf_func_id func_id,
> + const struct bpf_prog *prog)
> +{
> + switch (func_id) {
> + default:
> + return bpf_base_func_proto(func_id);
> + }
> +}
> +
> +static bool bpf_fuse_is_valid_access(int off, int size,
> + enum bpf_access_type type,
> + const struct bpf_prog *prog,
> + struct bpf_insn_access_aux *info)
> +{
> + return bpf_tracing_btf_ctx_access(off, size, type, prog, info);
> +}
> +
> +const struct btf_type *fuse_buffer_struct_type;
> +
> +static int bpf_fuse_btf_struct_access(struct bpf_verifier_log *log,
> + const struct bpf_reg_state *reg,
> + int off, int size)
> +{
> + const struct btf_type *t;
> +
> + t = btf_type_by_id(reg->btf, reg->btf_id);
> + if (t == fuse_buffer_struct_type) {
> + bpf_log(log,
> + "direct access to fuse_buffer is disallowed\n");
> + return -EACCES;
> + }
> +
> + return 0;
> +}
> +
> +static const struct bpf_verifier_ops bpf_fuse_verifier_ops = {
> + .get_func_proto = bpf_fuse_get_func_proto,

you probably should be fine with just using bpf_tracing_func_proto as is

> + .is_valid_access = bpf_fuse_is_valid_access,

similarly, why custom no-op callback?

> + .btf_struct_access = bpf_fuse_btf_struct_access,
> +};
> +
> +static int bpf_fuse_check_member(const struct btf_type *t,
> + const struct btf_member *member,
> + const struct bpf_prog *prog)
> +{
> + //if (is_unsupported(__btf_member_bit_offset(t, member) / 8))
> + // return -ENOTSUPP;
> + return 0;
> +}
> +
> +static int bpf_fuse_init_member(const struct btf_type *t,
> + const struct btf_member *member,
> + void *kdata, const void *udata)
> +{
> + const struct fuse_ops *uf_ops;
> + struct fuse_ops *f_ops;
> + u32 moff;
> +
> + uf_ops = (const struct fuse_ops *)udata;
> + f_ops = (struct fuse_ops *)kdata;
> +
> + moff = __btf_member_bit_offset(t, member) / 8;
> + switch (moff) {
> + case offsetof(struct fuse_ops, name):
> + if (bpf_obj_name_cpy(f_ops->name, uf_ops->name,
> + sizeof(f_ops->name)) <= 0)
> + return -EINVAL;
> + //if (tcp_ca_find(utcp_ca->name))
> + // return -EEXIST;
> + return 1;
> + }
> +
> + return 0;
> +}
> +
> +static int bpf_fuse_init(struct btf *btf)
> +{
> + s32 type_id;
> +
> + type_id = btf_find_by_name_kind(btf, "fuse_buffer", BTF_KIND_STRUCT);
> + if (type_id < 0)
> + return -EINVAL;
> + fuse_buffer_struct_type = btf_type_by_id(btf, type_id);
> +

see BTF_ID and BTF_ID_LIST uses for how to get ID for your custom
well-known type

> + return 0;
> +}
> +
> +static struct bpf_fuse_ops_attach *fuse_reg = NULL;
> +

[...]

2023-04-27 04:59:20

by Andrii Nakryiko

Subject: Re: [RFC PATCH v3 35/37] tools: Add FUSE, update bpf includes

On Mon, Apr 17, 2023 at 6:42 PM Daniel Rosenberg <[email protected]> wrote:
>
> Updates the bpf includes under tools, and adds fuse
>
> Signed-off-by: Daniel Rosenberg <[email protected]>
> ---
> tools/include/uapi/linux/bpf.h | 12 +
> tools/include/uapi/linux/fuse.h | 1135 +++++++++++++++++++++++++++++++
> 2 files changed, 1147 insertions(+)
> create mode 100644 tools/include/uapi/linux/fuse.h
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 4b20a7269bee..6521c40875c7 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -7155,4 +7155,16 @@ struct bpf_iter_num {
> __u64 __opaque[1];
> } __attribute__((aligned(8)));
>
> +/* Return Codes for Fuse BPF struct_op programs */
> +#define BPF_FUSE_CONTINUE 0
> +#define BPF_FUSE_USER 1
> +#define BPF_FUSE_USER_PREFILTER 2
> +#define BPF_FUSE_POSTFILTER 3
> +#define BPF_FUSE_USER_POSTFILTER 4

nit: can this be an enum instead? It would be more self-documenting,
IMO. And given it's FUSE BPF-specific, why is it not in
uapi/linux/fuse.h?

> +
> +/* Op Code Filter values for BPF Programs */
> +#define FUSE_OPCODE_FILTER 0x0ffff
> +#define FUSE_PREFILTER 0x10000
> +#define FUSE_POSTFILTER 0x20000
> +

[...]

2023-04-28 00:51:11

by Daniel Rosenberg

Subject: Re: [RFC PATCH v3 35/37] tools: Add FUSE, update bpf includes

On Wed, Apr 26, 2023 at 9:24 PM Andrii Nakryiko
<[email protected]> wrote:
>
> On Mon, Apr 17, 2023 at 6:42 PM Daniel Rosenberg <[email protected]> wrote:
> >
> > +/* Return Codes for Fuse BPF struct_op programs */
> > +#define BPF_FUSE_CONTINUE 0
> > +#define BPF_FUSE_USER 1
> > +#define BPF_FUSE_USER_PREFILTER 2
> > +#define BPF_FUSE_POSTFILTER 3
> > +#define BPF_FUSE_USER_POSTFILTER 4
>
> nit: can this be an enum instead? It would be more self-documenting,
> IMO. At given it's FUSE BPF-specific, why is it not in
> uapi/linux/fuse.h?
>

An enum would be nicer. And I'm sure there are plenty of things that
are probably in the wrong place right now. I'll be moving most of the
changes in bpf specific areas over to fuse specific areas when
struct_ops can handle modules. This particular one can move now
though.
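
A sketch of what the enum form could look like follows. The values mirror the #defines in the patch; the enum tag bpf_fuse_return_code and the helper bpf_fuse_return_name are hypothetical names chosen for illustration, not something the patch defines.

```c
#include <assert.h>

/* The enum tag is hypothetical; the values mirror the #defines in the patch. */
enum bpf_fuse_return_code {
	BPF_FUSE_CONTINUE = 0,
	BPF_FUSE_USER = 1,
	BPF_FUSE_USER_PREFILTER = 2,
	BPF_FUSE_POSTFILTER = 3,
	BPF_FUSE_USER_POSTFILTER = 4,
};

/* Pin the numeric values so the enum stays compatible with the #defines. */
static_assert(BPF_FUSE_CONTINUE == 0, "value must match the #define");
static_assert(BPF_FUSE_USER_POSTFILTER == 4, "value must match the #define");

/* Hypothetical helper: an enum makes this kind of debug output easy. */
static const char *bpf_fuse_return_name(enum bpf_fuse_return_code rc)
{
	switch (rc) {
	case BPF_FUSE_CONTINUE:
		return "BPF_FUSE_CONTINUE";
	case BPF_FUSE_USER:
		return "BPF_FUSE_USER";
	case BPF_FUSE_USER_PREFILTER:
		return "BPF_FUSE_USER_PREFILTER";
	case BPF_FUSE_POSTFILTER:
		return "BPF_FUSE_POSTFILTER";
	case BPF_FUSE_USER_POSTFILTER:
		return "BPF_FUSE_USER_POSTFILTER";
	}
	return "unknown";
}
```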

2023-05-02 00:15:48

by Daniel Rosenberg

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

On Mon, Apr 24, 2023 at 8:32 AM Miklos Szeredi <[email protected]> wrote:
>
>
> The security model needs to be thought about and documented. Think
> about this: the fuse server now delegates operations it would itself
> perform to the passthrough code in fuse. The permissions that would
> have been checked in the context of the fuse server are now checked in
> the context of the task performing the operation. The server may be
> able to bypass seccomp restrictions. Files that are open on the
> backing filesystem are now hidden (e.g. lsof won't find these), which
> allows the server to obfuscate accesses to backing files. Etc.
>
> These are not particularly worrying if the server is privileged, but
> fuse comes with the history of supporting unprivileged servers, so we
> should look at supporting passthrough with unprivileged servers as
> well.
>

This is on my todo list. My current plan is to grab the creds that the
daemon uses to respond to FUSE_INIT. That should keep behavior fairly
similar. I'm not sure if there are cases where the fuse server is
operating under multiple contexts.
I don't currently have a plan for exposing open files via lsof. Every
such file should relate to one that will show up though. I haven't dug
into how that's set up, but I'm open to suggestions.

> My other generic comment is that you should add justification for
> doing this in the first place. I guess it's mainly performance. So
> how performance can be won in real life cases? It would also be good
> to measure the contribution of individual ops to that win. Is there
> another reason for this besides performance?
>
> Thanks,
> Miklos

Our main concern with it is performance. We have some preliminary
numbers looking at the pure passthrough case. We've been testing using
a ramdrive on a somewhat slow machine, as that should highlight
differences more. We ran fio for sequential reads, and random
read/write. For sequential reads, we were seeing libfuse's
passthrough_hp take about a 50% hit, with fuse-bpf not being
detectably slower. For random read/write, we were seeing a roughly 90%
drop in performance from passthrough_hp, while fuse-bpf has about a 7%
drop in read and write speed. When we use a bpf that traces every
opcode, that performance hit increases to a roughly 1% drop in
sequential read performance, and a 20% drop in both read and write
performance for random read/write. We plan to make more complex bpf
examples, with fuse daemon equivalents to compare against.

We have not looked closely at the impact of individual opcodes yet.

There's also a potential ease-of-use benefit with fuse-bpf. If you're
implementing a fuse daemon that is largely mirroring a backing
filesystem, you only need to write code for the differences in
behavior. For instance, say you want to remove image metadata like
location. You could give bpf information on what range of data is
metadata, and zero out that section without having to handle any other
operations.
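
As a rough userspace illustration of the metadata example, the core of such a filter is a range intersection followed by a memset. The function scrub_range and its parameters are hypothetical; how the metadata range would actually reach the bpf program (a map, for instance) is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Zero the part of a read buffer that overlaps the metadata range
 * [meta_start, meta_end), both given as file offsets.
 * buf holds read_len bytes that were read starting at read_off.
 * Returns the number of bytes zeroed.
 */
static size_t scrub_range(uint8_t *buf, uint64_t read_off, size_t read_len,
			  uint64_t meta_start, uint64_t meta_end)
{
	uint64_t read_end = read_off + read_len;
	uint64_t lo = meta_start > read_off ? meta_start : read_off;
	uint64_t hi = meta_end < read_end ? meta_end : read_end;

	assert(buf != NULL);

	/* No overlap between the read and the metadata range. */
	if (lo >= hi)
		return 0;

	memset(buf + (lo - read_off), 0, (size_t)(hi - lo));
	return (size_t)(hi - lo);
}
```

Reads that miss the range return 0 and leave the buffer untouched, which matches the point above: only the difference in behavior needs code.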

-Daniel

2023-05-02 03:42:02

by Alexei Starovoitov

Subject: Re: [RFC PATCH v3 08/37] fuse: Add fuse-bpf, a stacked fs extension for FUSE

On Mon, Apr 17, 2023 at 06:40:08PM -0700, Daniel Rosenberg wrote:
> Fuse-bpf provides a short circuit path for Fuse implementations that act
> as a stacked filesystem. For cases that are directly unchanged,
> operations are passed directly to the backing filesystem. Small
> adjustments can be handled by bpf prefilters or postfilters, with the
> option to fall back to userspace as needed.

Here is my understanding of fuse-bpf design:
- bpf progs can mostly read-only access fuse_args before and after proper vfs
operation on a backing path/file/inode.
- args are unconditionally prepared for bpf prog consumption, but progs won't
be doing anything with them most of the time.
- progs unfortunately cannot do any real work. they're nothing but simple filters.
They can give 'green light' for a fuse_FOO op to be delegated to proper vfs_FOO
in backing file. The logic in this patch keeps track of backing_path/file/inode.
- in other words bpf side is "dumb", but it's telling kernel what to do with
real things like path/file/inode and the kernel is doing real work and calling vfs_*.

This design adds non-negligible overhead to fuse when CONFIG_FUSE_BPF is set.
Compared to a trip to user space it's close to zero, but the cost of
initialize_in/out + backing + finalize is not free.
The patch 33 is especially odd.
fuse has a traditional mechanism to upcall to user space with fuse_simple_request.
The patch 33 allows bpf prog to return special return value and trigger two more
fuse_bpf_simple_request-s to user space. Not clear why.
It seems to me that the main assumption of the fuse bpf design is that bpf prog
has to stay short and simple. It cannot do much other than reading and comparing
strings with the help of dynptr.
How about we allow bpf attach to fuse_simple_request and nothing else?
All fuse ops call it anyway and cmd is already encoded in the args.
Then let bpf prog read fuse_args as-is (without converting them to bpf_fuse_args)
and avoid doing actual fuse_req to user space.
Also allow bpf prog acquire and remember path/file/inode.
The verifier is already smart enough to track that the prog is doing it safely
without leaking references and what not.
And, of course, allow bpf prog call vfs_* via kfuncs.
In other words, instead of hard coding
+#define bpf_fuse_backing(inode, io, out, \
+ initialize_in, initialize_out, \
+ backing, finalize, args...) \
one for each fuse_ops in the kernel let bpf prog do the same but on demand.
The biggest advantage is that this patch set instead of 95% on fuse side and 5% on bpf
will become 5% addition to fuse code. All the logic will be handled purely by bpf.
Right now you're limiting it to one backing_file per fuse_file.
With bpf prog driving it the prog can keep multiple backing_files and shuffle
access to them as prog decides.
Instead of doing 'return BPF_FUSE_CONTINUE' the bpf progs will
pass 'path' to kfunc bpf_vfs_open, then stash 'struct bpf_file*', etc.
Probably will be easier to white board this idea during lsfmmbpf.

First 3 patches look fine. Thank you for resending them separately.

2023-05-03 01:58:16

by Daniel Rosenberg

Subject: Re: [RFC PATCH v3 28/37] WIP: bpf: Add fuse_ops struct_op programs

On Wed, Apr 26, 2023 at 9:18 PM Andrii Nakryiko
<[email protected]> wrote:
>
> Have you considered grouping this huge amount of callbacks into a
> smaller set of more generic callbacks where each callback would get
> enum argument specifying what sort of operation it is called for? This
> has many advantages, starting from not having to deal with struct_ops
> limits, ending with not needing to instantiate dozens of individual
> BPF programs.
>
> E.g., for a lot of operations the difference between pre- and
> post-filter is in having in argument as read-only and maybe having
> extra out argument for post-filter. One way to unify such post/pre
> filters into one callback would be to record whether in has to be
> read-only or read-write and not allow to create r/w dynptr for the
> former case. Pass bool or enum specifying if it's post or pre filter.
> For that optional out argument, you can simulate effectively the same
> by always supplying it, but making sure that out parameter is
> read-only and zero-sized, for example.
>
> That would cut the number of callbacks in two, which I'd say still is
> not great :) I think it would be better still to have even larger
> groups of callbacks for whole families of operations with the same (or
> "unifiable") interface (domain experts like you would need to do an
> analysis here to see what makes sense to group, of course).
>
> We'll probably touch on that tomorrow at BPF office hours, but I
> wanted to point this out beforehand, so that you have time to think
> about it.
>

The meta info struct we pass in includes the opcode which contains
whether it is a prefilter or postfilter, although I guess that may be
less accessible to the verifier than a separate bool. In the v1
version, we handled all op codes in a single program, although I think
we were running into some slowdowns when we had every opcode in a
giant switch statement, plus we were incurring the cost of the bpf
program even when we didn't need to do anything in it. The struct_op
version lets us entirely skip calling the bpf for opcodes we don't
need to handle.
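
For reference, a small sketch of how a program could split such an opcode using the masks from the patch. The three mask values are copied from the patch; the helper names, and the assumption that the phase bit is simply OR-ed onto the raw opcode, are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks as defined in the patch. */
#define FUSE_OPCODE_FILTER 0x0ffff
#define FUSE_PREFILTER 0x10000
#define FUSE_POSTFILTER 0x20000

/* Hypothetical helper: tag a raw opcode with one filter phase bit. */
static uint32_t tag_opcode(uint32_t opcode, uint32_t phase_bit)
{
	assert((opcode & ~(uint32_t)FUSE_OPCODE_FILTER) == 0);
	assert(phase_bit == FUSE_PREFILTER || phase_bit == FUSE_POSTFILTER);
	return opcode | phase_bit;
}

/* Recover the raw opcode from a tagged value. */
static uint32_t raw_opcode(uint32_t tagged)
{
	return tagged & FUSE_OPCODE_FILTER;
}

static bool is_prefilter(uint32_t tagged)
{
	return (tagged & FUSE_PREFILTER) != 0;
}

static bool is_postfilter(uint32_t tagged)
{
	return (tagged & FUSE_POSTFILTER) != 0;
}
```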

Many of the arguments we pass currently are structs. If they were all
dynptrs, we could set the output related ones to empty/readonly, but
that removes one of the other strengths of the struct_op setup, where
we can actually label the inputs as the structs they are instead of a
void* equivalent. There are definitely some cases where we could
easily merge opcode callbacks, like FUSE_FSYNCDIR/FUSE_FSYNC and
FUSE_OPEN/FUSE_OPENDIR. I set them up as separate since it's easy to
assign the same program to both callbacks in the case where you want
both to be handled the same, while maintaining flexibility to handle
them separately.
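
A userspace model of those two properties, skipping unset callbacks and sharing one program between two slots, might look like the following. All model_* names, count_fsync and run_fsync_prefilter are hypothetical stand-ins; the real callbacks take struct bpf_fuse_meta_info and struct fuse_fsync_in, and the real dispatch lives in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structs. */
struct model_meta { uint32_t opcode; };
struct model_fsync_in { uint32_t fsync_flags; };

typedef uint32_t (*model_fsync_fn)(const struct model_meta *meta,
				   struct model_fsync_in *in);

struct model_fuse_ops {
	model_fsync_fn fsync_prefilter;
	model_fsync_fn dir_fsync_prefilter;
};

static unsigned int fsync_calls;

/* One "program" that can be assigned to both slots. */
static uint32_t count_fsync(const struct model_meta *meta,
			    struct model_fsync_in *in)
{
	(void)meta;
	(void)in;
	fsync_calls++;
	return 0; /* BPF_FUSE_CONTINUE in the patch */
}

/* Caller side: an unset slot means no program runs at all. */
static uint32_t run_fsync_prefilter(const struct model_fuse_ops *ops,
				    bool is_dir,
				    const struct model_meta *meta,
				    struct model_fsync_in *in)
{
	model_fsync_fn fn;

	assert(ops != NULL);
	fn = is_dir ? ops->dir_fsync_prefilter : ops->fsync_prefilter;
	return fn ? fn(meta, in) : 0;
}
```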

> +void bpf_fuse_get_rw_dynptr(struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit, u64 size, bool copy)
>
> not clear why size is passed from outside instead of instantiating
> dynptr with buffer->size? See [0] for bpf_dynptr_adjust and
> bpf_dynptr_clone that allow you to adjust buffer as necessary.
>
> As for the copy parameter, can you elaborate on the idea behind it?
>
> [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=741584&state=*
>

We're storing these buffers as fuse_buffers initially because of the
additional metadata we're carrying. Some fields have variable lengths,
or are backed by const data. For instance, names. If you wanted to
alter the name we use on the lower filesystem, you cannot change it
directly since it's being backed by the dentry name. If you wanted to
adjust something, like perhaps adding an extension, you would pass
bpf_fuse_get_rw_dynptr the size you'd want for the new buffer, and
copy=true to get the preexisting data. Fuse_buffer tracks that data
was allocated so Fuse can clean up after the call. Additionally, say
you wanted to trim half the data returned by an xattr for some reason.
You would give it a size less than the buffer size to inform fuse that
it should ignore the second half of the data. That part could be
handled by bpf_dynptr_adjust if we didn't also need to handle the
allocation case.
Say you wanted to have the lower file name be the hash of the one you
created. In that case, you could call bpf_fuse_get_ro_dynptr to read
the name and compute the hash, and then bpf_fuse_get_rw_dynptr with
copy=false to get a buffer to write the hash to. Since the new data is
not directly related to the original data, there would be no benefit
to getting a copy.
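
The size/copy behavior described above can be modeled in userspace roughly as follows. This is not the kernel's bpf_fuse_get_writeable; struct model_buffer, its fields and model_get_writeable are hypothetical, and the rule that a writable buffer is resized in place whenever it is large enough is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a buffer that may be backed by const data. */
struct model_buffer {
	void *data;
	size_t size;     /* bytes the caller should treat as valid */
	size_t capacity; /* bytes available behind data */
	bool immutable;  /* original backing must not be written (dentry name) */
	bool allocated;  /* data was allocated here and must be freed later */
};

/* Return writable memory of the requested size, or NULL on failure. */
static void *model_get_writeable(struct model_buffer *b, size_t size, bool copy)
{
	void *fresh;
	size_t keep;

	assert(b != NULL);

	/* Writable memory that is already large enough: only adjust the length. */
	if ((b->allocated || !b->immutable) && size <= b->capacity) {
		b->size = size;
		return b->data;
	}

	fresh = calloc(1, size ? size : 1);
	if (!fresh)
		return NULL;

	/* copy=true carries the preexisting data over, truncated if needed. */
	if (copy && b->data) {
		keep = b->size < size ? b->size : size;
		memcpy(fresh, b->data, keep);
	}
	if (b->allocated)
		free(b->data);

	b->data = fresh;
	b->size = size;
	b->capacity = size;
	b->allocated = true; /* lets the owner clean up after the call */
	return fresh;
}
```

Adding an extension to a name is then a call with a larger size and copy=true followed by a write past the old length, and trimming is a call with a smaller size. The caller remains responsible for freeing data once allocated is set.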

I initially intended for bpf_fuse_get_ro_dynptr/bpf_fuse_get_rw_dynptr
to be called at most once for each field, but that may be too
restrictive. At the moment, if you make two calls that require
reallocating, any pointers to the old buffer would be invalid. This is
not the case for the original name, as we aren't touching the original
source. There are two possible approaches here. I could either
refcount the buffer and have a put kfunc, or I could invalidate old
dynptrs when bpf_fuse_get_rw_dynptr is called, similar to what
skb/xdp do. I'm leaning towards the latter to disallow having many
allocations active at once by calling bpf_fuse_get_rw_dynptr for
increasing sizes, though I could also just disallow reallocating a
buffer that already was reallocated.

The new dynptr helpers are pretty exciting since they'll make it much
easier to deal with chunks of data, which we may end up doing in
read/write filters. I haven't fully set those up since I was waiting
to see what the dynptr helpers ended up looking like.


> > +void bpf_fuse_get_ro_dynptr(const struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit)
>
> these kfuncs probably should be more consistently named as
> bpf_dynptr_from_fuse_buffer_{ro,rw}() ?
>
Yeah, that fits in much better with the skb/xdp functions.

> > +
> > +uint32_t bpf_fuse_return_len(struct fuse_buffer *buffer)
> > +{
> > + return buffer->size;
>
> you should be able to get this with bpf_dynptr_size() (once you create
> it from fuse_buffer).
>

Yes, this might be unnecessary. I added it while testing kfuncs, and
had intended to use it with a fuse_buffer strncmp before I saw that
there's now a bpf_strncmp :) I had tried using it with
bpf_dynptr_slice, but that requires a known constant at verification
time, which may make using it in real cases a bit difficult...
bpf_strncmp also has some restrictions around the second string being
a fixed map, or something like that.

>
> you probably should be fine with just using bpf_tracing_func_proto as is
>
> > + .is_valid_access = bpf_fuse_is_valid_access,
>
> similarly, why custom no-op callback?
>

Those are largely carried over from iterations when I was less sure
what I would need. A lot of the work I was doing in the v1 code is
handled by default with the struct_op setup now, or is otherwise
unnecessary. This area in particular needs a lot of cleanup.

> > +static int bpf_fuse_init(struct btf *btf)
> > +{
> > + s32 type_id;
> > +
> > + type_id = btf_find_by_name_kind(btf, "fuse_buffer", BTF_KIND_STRUCT);
> > + if (type_id < 0)
> > + return -EINVAL;
> > + fuse_buffer_struct_type = btf_type_by_id(btf, type_id);
> > +
>
> see BTF_ID and BTF_ID_LIST uses for how to get ID for your custom
> well-known type
>
Thanks, I'll look into those.

2023-05-03 03:49:26

by Amir Goldstein

Subject: Re: [RFC PATCH v3 08/37] fuse: Add fuse-bpf, a stacked fs extension for FUSE

On Tue, May 2, 2023 at 6:38 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Mon, Apr 17, 2023 at 06:40:08PM -0700, Daniel Rosenberg wrote:
> > Fuse-bpf provides a short circuit path for Fuse implementations that act
> > as a stacked filesystem. For cases that are directly unchanged,
> > operations are passed directly to the backing filesystem. Small
> > adjustments can be handled by bpf prefilters or postfilters, with the
> > option to fall back to userspace as needed.
>
> Here is my understanding of fuse-bpf design:
> - bpf progs can mostly read-only access fuse_args before and after proper vfs
> operation on a backing path/file/inode.
> - args are unconditionally prepared for bpf prog consumption, but progs won't
> be doing anything with them most of the time.
> - progs unfortunately cannot do any real work. they're nothing but simple filters.
> They can give 'green light' for a fuse_FOO op to be delegated to proper vfs_FOO
> in backing file. The logic in this patch keeps track of backing_path/file/inode.
> - in other words bpf side is "dumb", but it's telling kernel what to do with
> real things like path/file/inode and the kernel is doing real work and calling vfs_*.
>
> This design adds non-negligible overhead to fuse when CONFIG_FUSE_BPF is set.
> Comparing to trip to user space it's close to zero, but the cost of
> initialize_in/out + backing + finalize is not free.
> The patch 33 is especially odd.
> fuse has a traditional mechanism to upcall to user space with fuse_simple_request.
> The patch 33 allows bpf prog to return special return value and trigger two more
> fuse_bpf_simple_request-s to user space. Not clear why.
> It seems to me that the main assumption of the fuse bpf design is that bpf prog
> has to stay short and simple. It cannot do much other than reading and comparing
> strings with the help of dynptr.
> How about we allow bpf attach to fuse_simple_request and nothing else?
> All fuse ops call it anyway and cmd is already encoded in the args.
> Then let bpf prog read fuse_args as-is (without converting them to bpf_fuse_args)
> and avoid doing actual fuse_req to user space.
> Also allow bpf prog acquire and remember path/file/inode.
> The verifier is already smart enough to track that the prog is doing it safely
> without leaking references and what not.
> And, of course, allow bpf prog call vfs_* via kfuncs.
> In other words, instead of hard coding
> +#define bpf_fuse_backing(inode, io, out, \
> + initialize_in, initialize_out, \
> + backing, finalize, args...) \
> one for each fuse_ops in the kernel let bpf prog do the same but on demand.
> The biggest advantage is that this patch set instead of 95% on fuse side and 5% on bpf
> will become 5% addition to fuse code. All the logic will be handled purely by bpf.
> Right now you're limiting it to one backing_file per fuse_file.
> With bpf prog driving it the prog can keep multiple backing_files and shuffle
> access to them as prog decides.
> Instead of doing 'return BPF_FUSE_CONTINUE' the bpf progs will
> pass 'path' to kfunc bpf_vfs_open, than stash 'struct bpf_file*', etc.
> Probably will be easier to white board this idea during lsfmmbpf.
>

I have to admit that sounds a bit challenging, but I'm up for sitting
in front of that whiteboard :)

BTW, thanks Daniel (Borkmann) for sorting out the cross track
sessions for FS-BPF.
We have another FS only session on FUSE-BPF, but I feel there is plenty
to discuss on the FUSE-bypass part, as well as on the BPF part.
Same goes for the BPF iterators for filesystems session.

Thanks,
Amir.

2023-05-03 18:26:14

by Andrii Nakryiko

Subject: Re: [RFC PATCH v3 28/37] WIP: bpf: Add fuse_ops struct_op programs

On Tue, May 2, 2023 at 6:53 PM Daniel Rosenberg <[email protected]> wrote:
>
> On Wed, Apr 26, 2023 at 9:18 PM Andrii Nakryiko
> <[email protected]> wrote:
> >
> > Have you considered grouping this huge amount of callbacks into a
> > smaller set of more generic callbacks where each callback would get
> > enum argument specifying what sort of operation it is called for? This
> > has many advantages, starting from not having to deal with struct_ops
> > limits, ending with not needing to instantiate dozens of individual
> > BPF programs.
> >
> > E.g., for a lot of operations the difference between pre- and
> > post-filter is in having in argument as read-only and maybe having
> > extra out argument for post-filter. One way to unify such post/pre
> > filters into one callback would be to record whether in has to be
> > read-only or read-write and not allow to create r/w dynptr for the
> > former case. Pass bool or enum specifying if it's post or pre filter.
> > For that optional out argument, you can simulate effectively the same
> > by always supplying it, but making sure that out parameter is
> > read-only and zero-sized, for example.
> >
> > That would cut the number of callbacks in two, which I'd say still is
> > not great :) I think it would be better still to have even larger
> > groups of callbacks for whole families of operations with the same (or
> > "unifiable") interface (domain experts like you would need to do an
> > analysis here to see what makes sense to group, of course).
> >
> > We'll probably touch on that tomorrow at BPF office hours, but I
> > wanted to point this out beforehand, so that you have time to think
> > about it.
> >
>
> The meta info struct we pass in includes the opcode which contains
> whether it is a prefilter or postfilter, although I guess that may be
> less accessible to the verifier than a separate bool. In the v1
> version, we handled all op codes in a single program, although I think
> we were running into some slowdowns when we had every opcode in a
> giant switch statement, plus we were incurring the cost of the bpf
> program even when we didn't need to do anything in it. The struct_op
> version lets us entirely skip calling the bpf for opcodes we don't
> need to handle.
>
> Many of the arguments we pass currently are structs. If they were all
> dynptrs, we could set the output related ones to empty/readonly, but
> that removes one of the other strengths of the struct_op setup, where
> we can actually label the inputs as the structs they are instead of a
> void* equivalent. There are definitely some cases where we could
> easily merge opcode callbacks, like FUSE_FSYNCDIR/FUSE_FSYNC and
> FUSE_OPEN/FUSE_OPENDIR. I set them up as separate since it's easy to
> assign the same program to both callbacks in the case where you want
> both to be handled the same, while maintaining flexibility to handle
> them separately.

If combining hooks doesn't bring any value and simplification, I think
it's fine to keep it as is. I was mostly probing if there is an
equally convenient, but more succinct API that could be exposed
through struct_ops. If there is none, then it's fine.
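
For illustration only (not taken from the patch set): a userspace sketch
of the per-opcode slot idea discussed above, where an empty slot means the
filter call is skipped entirely and the same function can be assigned to
two slots. All demo_* names and opcode values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; the real FUSE opcode values are not used here. */
enum demo_op { DEMO_OPEN, DEMO_OPENDIR, DEMO_FSYNC, DEMO_OP_COUNT };

/* A filter either lets the backing filesystem continue or handles the op. */
enum demo_ret { DEMO_CONTINUE = 0, DEMO_HANDLED = 1 };

typedef enum demo_ret (*demo_filter_fn)(enum demo_op op, int *calls);

/* One slot per opcode, struct_ops style. NULL means "no program attached". */
struct demo_ops {
	demo_filter_fn prefilter[DEMO_OP_COUNT];
};

/* Example filter: counts how often it runs, then defers to the backing fs. */
static enum demo_ret demo_count_open(enum demo_op op, int *calls)
{
	(void)op;
	(*calls)++;
	return DEMO_CONTINUE;
}

/* Dispatch: an empty slot costs only a pointer check, no filter call. */
static enum demo_ret demo_dispatch(const struct demo_ops *ops,
				   enum demo_op op, int *calls)
{
	assert(op < DEMO_OP_COUNT);
	if (!ops->prefilter[op])
		return DEMO_CONTINUE;
	return ops->prefilter[op](op, calls);
}
```

Assigning demo_count_open to both the DEMO_OPEN and DEMO_OPENDIR slots
gives shared handling, while DEMO_FSYNC stays unhooked and never enters a
filter.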

>
> > +void bpf_fuse_get_rw_dynptr(struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit, u64 size, bool copy)
> >
> > not clear why size is passed from outside instead of instantiating
> > dynptr with buffer->size? See [0] for bpf_dynptr_adjust and
> > bpf_dynptr_clone that allow you to adjust buffer as necessary.
> >
> > As for the copy parameter, can you elaborate on the idea behind it?
> >
> > [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=741584&state=*
> >
>
> We're storing these buffers as fuse_buffers initially because of the
> additional metadata we're carrying. Some fields have variable lengths,
> or are backed by const data. For instance, names. If you wanted to
> alter the name we use on the lower filesystem, you cannot change it
> directly since it's being backed by the dentry name. If you wanted to
> adjust something, like perhaps adding an extension, you would pass
> bpf_fuse_get_rw_dynptr the size you'd want for the new buffer, and
> copy=true to get the preexisting data. Fuse_buffer tracks that data
> was allocated so Fuse can clean up after the call. Additionally, say
> you wanted to trim half the data returned by an xattr for some reason.
> You would give it a size less than the buffer size to inform fuse that
> it should ignore the second half of the data. That part could be
> handled by bpf_dynptr_adjust if we didn't also need to handle the
> allocation case.

Interesting point about allocations and needing to realloc names. But
I wonder if it makes more sense to split out the copy/reallocation part
and do it with a separate kfunc, leaving the dynptr only as a means to
work with that data. So you'd do something like below for the read/write case:

bpf_fuse_buf_clone(&buffer, new_size);
bpf_fuse_dynptr_from_buf_rw(&buffer, &dynptr);

But would skip bpf_fuse_buf_clone() if you only ever read:

bpf_fuse_dynptr_from_buf_ro(&buffer, &dynptr);

If fuse_buffer was never cloned/realloced, then
bpf_fuse_dynptr_from_buf_rw() should just fail and return invalid
dynptr.
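
For illustration only: a userspace model of the split proposed above,
assuming the rule that a writable view is refused until the buffer has
been cloned. struct demo_buf and struct demo_view are hypothetical
stand-ins for fuse_buffer and a dynptr; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for fuse_buffer; the field names are assumptions. */
struct demo_buf {
	const char *ro_data;	/* original, possibly const-backed data */
	char *rw_data;		/* private copy, NULL until cloned */
	size_t size;
};

/* Stand-in for a dynptr: a bounded view that may be read-only. */
struct demo_view {
	const char *data;
	char *wdata;		/* NULL for read-only views */
	size_t size;
	bool valid;
};

/* Allocate a private copy of new_size bytes, keeping the old contents. */
static int demo_buf_clone(struct demo_buf *buf, size_t new_size)
{
	const char *src = buf->rw_data ? buf->rw_data : buf->ro_data;
	size_t keep = buf->size < new_size ? buf->size : new_size;
	char *copy = calloc(1, new_size ? new_size : 1);

	if (!copy)
		return -1;
	if (keep && src)
		memcpy(copy, src, keep);
	free(buf->rw_data);
	buf->rw_data = copy;
	buf->size = new_size;
	return 0;
}

/* A read-only view is always available. */
static void demo_view_from_buf_ro(const struct demo_buf *buf,
				  struct demo_view *view)
{
	assert(buf && view);
	view->data = buf->rw_data ? buf->rw_data : buf->ro_data;
	view->wdata = NULL;
	view->size = buf->size;
	view->valid = true;
}

/* A writable view fails unless the buffer was cloned first. */
static int demo_view_from_buf_rw(const struct demo_buf *buf,
				 struct demo_view *view)
{
	if (!buf->rw_data) {
		*view = (struct demo_view){ 0 };
		return -1;
	}
	view->data = buf->rw_data;
	view->wdata = buf->rw_data;
	view->size = buf->size;
	view->valid = true;
	return 0;
}
```

In this model, appending an extension to a name means cloning to the
larger size first and only then asking for the writable view.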


> Say you wanted to have the lower file name be the hash of the one you
> created. In that case, you could get bpf_fuse_get_ro_dynptr to get
> access to compute the hash, and then bpf_fuse_get_rw_dynptr to get a
> buffer to write the hash to. Since the data is not directly related to
> the original data, there would be no benefit to getting a copy.
>
> I initially intended for bpf_fuse_get_ro_dynptr/bpf_fuse_get_rw_dynptr
> to be called at most once for each field, but that may be too
> restrictive. At the moment, if you make two calls that require
> reallocating, any pointers to the old buffer would be invalid. This is
> not the case for the original name, as we aren't touching the original
> source. There are two possible approaches here. I could either
> refcount the buffer and have a put kfunc, or I could invalidate old
> dynpointers when bpf_fuse_get_rw_dynptr is called, similar to what
> skb/xdp do. I'm leaning towards the latter to disallow having many
> allocations active at once by calling bpf_fuse_get_rw_dynptr for
> increasing sizes, though I could also just disallow reallocating a
> buffer that already was reallocated.

Yes, invalidating dynptrs sounds like a way to go. But I think instead
of bundling all that into dynptr constructor for fuse_buffer, it's
better to have a separate kfunc that would be doing realloc/cloning
*and* invalidating. Other than that, neither from_buf_rw nor
from_buf_ro should be doing invalidation, because they can't cause
realloc. WDYT?
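
For illustration only: a runtime analogy for "realloc invalidates older
views", using a generation counter. This is an assumption-laden sketch;
in the kernel, this kind of invalidation would be enforced by the
verifier at load time rather than by a runtime check, and all demo_gen_*
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Buffer whose generation is bumped on every reallocation. */
struct demo_gen_buf {
	unsigned int generation;
	size_t size;
};

/* View that remembers the generation it was created against. */
struct demo_gen_view {
	const struct demo_gen_buf *buf;
	unsigned int generation;
};

/* Creating a view never invalidates anything; it only records state. */
static void demo_gen_view_init(struct demo_gen_view *view,
			       const struct demo_gen_buf *buf)
{
	assert(view && buf);
	view->buf = buf;
	view->generation = buf->generation;
}

/* Only the realloc/clone step invalidates previously created views. */
static void demo_gen_realloc(struct demo_gen_buf *buf, size_t new_size)
{
	buf->size = new_size;
	buf->generation++;
}

/* A view is usable only if no realloc happened since it was created. */
static bool demo_gen_view_valid(const struct demo_gen_view *view)
{
	return view->buf && view->generation == view->buf->generation;
}
```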

>
> The new dynptr helpers are pretty exciting since they'll make it much
> easier to deal with chunks of data, which we may end up doing in
> read/write filters. I haven't fully set those up since I was waiting
> to see what the dynptr helpers ended up looking like.
>

Great, let us know how it goes in practice to start using them.

>
> > > +void bpf_fuse_get_ro_dynptr(const struct fuse_buffer *buffer, struct bpf_dynptr_kern *dynptr__uninit)
> >
> > these kfuncs probably should be more consistently named as
> > bpf_dynptr_from_fuse_buffer_{ro,rw}() ?
> >
> Yeah, that fits in much better with the skb/xdp functions.

great

>
> > > +
> > > +uint32_t bpf_fuse_return_len(struct fuse_buffer *buffer)
> > > +{
> > > + return buffer->size;
> >
> > you should be able to get this with bpf_dynptr_size() (once you create
> > it from fuse_buffer).
> >
>
> Yes, this might be unnecessary. I added it while testing kfuncs, and
> had intended to use it with a fuse_buffer strncmp before I saw that
> there's now a bpf_strncmp :) I had tried using it with
> bpf_dynptr_slice, but that requires a known constant at verification
> time, which may make using it in real cases a bit difficult...
> bpf_strncmp also has some restrictions around the second string being
> a fixed map, or something like that.

right, we might need a more flexible strncmp version working with two
dynptrs and not assuming a fixed string. We didn't have dynptr
abstraction for working with variable-sized memory when we were adding
bpf_strncmp.
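
For illustration only: one possible shape for a strncmp-like comparison
over two variable-sized buffers, written as plain userspace C. struct
demo_span is a hypothetical stand-in for a dynptr, and treating the end
of a buffer like a NUL terminator is an assumption of this sketch, not
an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer plus length, standing in for a variable-sized dynptr. */
struct demo_span {
	const char *data;
	size_t len;
};

/*
 * Compare at most n bytes. Reading past the end of a span yields 0,
 * so a shorter span behaves like a NUL-terminated string.
 * Returns -1, 0 or 1.
 */
static int demo_span_strncmp(const struct demo_span *a,
			     const struct demo_span *b, size_t n)
{
	size_t i;

	assert(a && b);
	for (i = 0; i < n; i++) {
		unsigned char ca = i < a->len ? (unsigned char)a->data[i] : 0;
		unsigned char cb = i < b->len ? (unsigned char)b->data[i] : 0;

		if (ca != cb)
			return ca < cb ? -1 : 1;
		if (ca == 0)
			return 0;
	}
	return 0;
}
```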

>
> >
> > you probably should be fine with just using bpf_tracing_func_proto as is
> >
> > > + .is_valid_access = bpf_fuse_is_valid_access,
> >
> > similarly, why custom no-op callback?
> >
>
> Those are largely carried over from iterations when I was less sure
> what I would need. A lot of the work I was doing in the v1 code is
> handled by default with the struct_op setup now, or is otherwise
> unnecessary. This area in particular needs a lot of cleanup.
>

ok

> > > +static int bpf_fuse_init(struct btf *btf)
> > > +{
> > > + s32 type_id;
> > > +
> > > + type_id = btf_find_by_name_kind(btf, "fuse_buffer", BTF_KIND_STRUCT);
> > > + if (type_id < 0)
> > > + return -EINVAL;
> > > + fuse_buffer_struct_type = btf_type_by_id(btf, type_id);
> > > +
> >
> > see BTF_ID and BTF_ID_LIST uses for how to get ID for your custom
> > well-known type
> >
> Thanks, I'll look into those.

2023-05-16 20:25:23

by Amir Goldstein

Subject: Re: [RFC PATCH v3 17/37] fuse-bpf: Add support for read/write iter

On Tue, Apr 18, 2023 at 4:41 AM Daniel Rosenberg <[email protected]> wrote:
>
> Adds backing support for FUSE_READ and FUSE_WRITE
>
> This includes adjustments from Amir Goldstein's patch to FUSE
> Passthrough
>
> Signed-off-by: Daniel Rosenberg <[email protected]>
> Signed-off-by: Paul Lawrence <[email protected]>
> ---
> fs/fuse/backing.c | 371 ++++++++++++++++++++++++++++++++++++++
> fs/fuse/control.c | 2 +-
> fs/fuse/file.c | 8 +
> fs/fuse/fuse_i.h | 19 +-
> fs/fuse/inode.c | 13 ++
> include/uapi/linux/fuse.h | 10 +
> 6 files changed, 421 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fuse/backing.c b/fs/fuse/backing.c
> index c6ef10aeec15..c7709a880e9c 100644
> --- a/fs/fuse/backing.c
> +++ b/fs/fuse/backing.c
> @@ -11,6 +11,7 @@
> #include <linux/file.h>
> #include <linux/fs_stack.h>
> #include <linux/namei.h>
> +#include <linux/uio.h>
>
> /*
> * expression statement to wrap the backing filter logic
> @@ -76,6 +77,89 @@
> handled; \
> })
>
> +#define FUSE_BPF_IOCB_MASK (IOCB_APPEND | IOCB_DSYNC | IOCB_HIPRI | IOCB_NOWAIT | IOCB_SYNC)
> +
> +struct fuse_bpf_aio_req {
> + struct kiocb iocb;
> + refcount_t ref;
> + struct kiocb *iocb_orig;
> + struct timespec64 pre_atime;
> +};
> +
> +static struct kmem_cache *fuse_bpf_aio_request_cachep;
> +
> +static void fuse_file_accessed(struct file *dst_file, struct file *src_file)
> +{
> + struct inode *dst_inode;
> + struct inode *src_inode;
> +
> + if (dst_file->f_flags & O_NOATIME)
> + return;
> +
> + dst_inode = file_inode(dst_file);
> + src_inode = file_inode(src_file);
> +
> + if ((!timespec64_equal(&dst_inode->i_mtime, &src_inode->i_mtime) ||
> + !timespec64_equal(&dst_inode->i_ctime, &src_inode->i_ctime))) {
> + dst_inode->i_mtime = src_inode->i_mtime;
> + dst_inode->i_ctime = src_inode->i_ctime;
> + }
> +
> + touch_atime(&dst_file->f_path);
> +}
> +
> +static void fuse_copyattr(struct file *dst_file, struct file *src_file)
> +{
> + struct inode *dst = file_inode(dst_file);
> + struct inode *src = file_inode(src_file);
> +
> + dst->i_atime = src->i_atime;
> + dst->i_mtime = src->i_mtime;
> + dst->i_ctime = src->i_ctime;
> + i_size_write(dst, i_size_read(src));
> + fuse_invalidate_attr(dst);
> +}
> +
> +static void fuse_file_start_write(struct file *fuse_file, struct file *backing_file,
> + loff_t pos, size_t count)
> +{
> + struct inode *inode = file_inode(fuse_file);
> + struct fuse_inode *fi = get_fuse_inode(inode);
> +
> + if (inode->i_size < pos + count)
> + set_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
> +
> + file_start_write(backing_file);
> +}
> +
> +static void fuse_file_end_write(struct file *fuse_file, struct file *backing_file,
> + loff_t pos, size_t res)
> +{
> + struct inode *inode = file_inode(fuse_file);
> + struct fuse_inode *fi = get_fuse_inode(inode);
> +
> + file_end_write(backing_file);
> +
> + if (res > 0)
> + fuse_write_update_attr(inode, pos, res);
> +
> + clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
> + fuse_invalidate_attr(inode);

This part is a bit out-of-date (was taken from my old branch)
FWIW, I pushed a more recent version of these patches to:
https://github.com/amir73il/linux/commits/fuse-passthrough-fd
(only compile tested)

> +}
> +
> +static void fuse_file_start_read(struct file *backing_file, struct timespec64 *pre_atime)
> +{
> + *pre_atime = file_inode(backing_file)->i_atime;
> +}
> +
> +static void fuse_file_end_read(struct file *fuse_file, struct file *backing_file,
> + struct timespec64 *pre_atime)
> +{
> + /* Mimic atime update policy of passthrough inode, not the value */
> + if (!timespec64_equal(&file_inode(backing_file)->i_atime, pre_atime))
> + fuse_invalidate_atime(file_inode(fuse_file));
> +}
> +
> static void fuse_get_backing_path(struct file *file, struct path *path)
> {
> path_get(&file->f_path);
> @@ -664,6 +748,277 @@ int fuse_bpf_lseek(loff_t *out, struct inode *inode, struct file *file, loff_t o
> file, offset, whence);
> }
>
> +static inline void fuse_bpf_aio_put(struct fuse_bpf_aio_req *aio_req)
> +{
> + if (refcount_dec_and_test(&aio_req->ref))
> + kmem_cache_free(fuse_bpf_aio_request_cachep, aio_req);
> +}
> +
> +static void fuse_bpf_aio_cleanup_handler(struct fuse_bpf_aio_req *aio_req, long res)
> +{
> + struct kiocb *iocb = &aio_req->iocb;
> + struct kiocb *iocb_orig = aio_req->iocb_orig;
> + struct file *filp = iocb->ki_filp;
> + struct file *fuse_filp = iocb_orig->ki_filp;
> +
> + if (iocb->ki_flags & IOCB_WRITE) {
> + __sb_writers_acquired(file_inode(iocb->ki_filp)->i_sb,
> + SB_FREEZE_WRITE);
> + fuse_file_end_write(iocb_orig->ki_filp, iocb->ki_filp, iocb->ki_pos, res);
> + } else {
> + fuse_file_end_read(fuse_filp, filp, &aio_req->pre_atime);
> + }
> + iocb_orig->ki_pos = iocb->ki_pos;
> + fuse_bpf_aio_put(aio_req);
> +}
> +
> +static void fuse_bpf_aio_rw_complete(struct kiocb *iocb, long res)
> +{
> + struct fuse_bpf_aio_req *aio_req =
> + container_of(iocb, struct fuse_bpf_aio_req, iocb);
> + struct kiocb *iocb_orig = aio_req->iocb_orig;
> +
> + fuse_bpf_aio_cleanup_handler(aio_req, res);
> + iocb_orig->ki_complete(iocb_orig, res);
> +}
> +
> +struct fuse_file_read_iter_args {
> + struct fuse_read_in in;
> + struct fuse_read_iter_out out;
> +};
> +
> +static int fuse_file_read_iter_initialize_in(struct bpf_fuse_args *fa, struct fuse_file_read_iter_args *args,
> + struct kiocb *iocb, struct iov_iter *to)
> +{
> + struct file *file = iocb->ki_filp;
> + struct fuse_file *ff = file->private_data;
> +
> + args->in = (struct fuse_read_in) {
> + .fh = ff->fh,
> + .offset = iocb->ki_pos,
> + .size = to->count,
> + };
> +
> + /* TODO we can't assume 'to' is a kvec */
> + /* TODO we also can't assume the vector has only one component */
> + *fa = (struct bpf_fuse_args) {
> + .info = (struct bpf_fuse_meta_info) {
> + .opcode = FUSE_READ,
> + .nodeid = ff->nodeid,
> + }, .in_numargs = 1,
> + .in_args[0].size = sizeof(args->in),
> + .in_args[0].value = &args->in,
> + /*
> + * TODO Design this properly.
> + * Possible approach: do not pass buf to bpf
> + * If going to userland, do a deep copy
> + * For extra credit, do that to/from the vector, rather than
> + * making an extra copy in the kernel
> + */
> + };
> +
> + return 0;
> +}
> +
> +static int fuse_file_read_iter_initialize_out(struct bpf_fuse_args *fa, struct fuse_file_read_iter_args *args,
> + struct kiocb *iocb, struct iov_iter *to)
> +{
> + args->out = (struct fuse_read_iter_out) {
> + .ret = args->in.size,
> + };
> +
> + fa->out_numargs = 1;
> + fa->out_args[0].size = sizeof(args->out);
> + fa->out_args[0].value = &args->out;
> +
> + return 0;
> +}
> +
> +static int fuse_file_read_iter_backing(struct bpf_fuse_args *fa, ssize_t *out,
> + struct kiocb *iocb, struct iov_iter *to)
> +{
> + struct fuse_read_iter_out *frio = fa->out_args[0].value;
> + struct file *file = iocb->ki_filp;
> + struct fuse_file *ff = file->private_data;
> +
> + if (!iov_iter_count(to))
> + return 0;
> +
> + if ((iocb->ki_flags & IOCB_DIRECT) &&
> + (!ff->backing_file->f_mapping->a_ops ||
> + !ff->backing_file->f_mapping->a_ops->direct_IO))
> + return -EINVAL;
> +
> + /* TODO This just plain ignores any change to fuse_read_in */
> + if (is_sync_kiocb(iocb)) {
> + struct timespec64 pre_atime;
> +
> + fuse_file_start_read(ff->backing_file, &pre_atime);
> + *out = vfs_iter_read(ff->backing_file, to, &iocb->ki_pos,
> + iocb_to_rw_flags(iocb->ki_flags, FUSE_BPF_IOCB_MASK));
> + fuse_file_end_read(file, ff->backing_file, &pre_atime);
> + } else {
> + struct fuse_bpf_aio_req *aio_req;
> +
> + *out = -ENOMEM;
> + aio_req = kmem_cache_zalloc(fuse_bpf_aio_request_cachep, GFP_KERNEL);
> + if (!aio_req)
> + goto out;
> +
> + aio_req->iocb_orig = iocb;
> + fuse_file_start_read(ff->backing_file, &aio_req->pre_atime);
> + kiocb_clone(&aio_req->iocb, iocb, ff->backing_file);
> + aio_req->iocb.ki_complete = fuse_bpf_aio_rw_complete;
> + refcount_set(&aio_req->ref, 2);
> + *out = vfs_iocb_iter_read(ff->backing_file, &aio_req->iocb, to);
> + fuse_bpf_aio_put(aio_req);
> + if (*out != -EIOCBQUEUED)
> + fuse_bpf_aio_cleanup_handler(aio_req, *out);
> + }
> +
> + frio->ret = *out;
> +
> + /* TODO Need to point value at the buffer for post-modification */
> +
> +out:
> + fuse_file_accessed(file, ff->backing_file);

fuse_file_accessed() looks redundant and less subtle than what
fuse_file_end_read() already does.
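
For illustration only: a userspace model of the policy that
fuse_file_start_read()/fuse_file_end_read() implement in the quoted
patch, namely marking the FUSE inode's atime stale only when the read
actually changed the backing inode's atime. struct demo_inode and the
demo_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Minimal inode model: an atime plus a "cached atime is stale" flag. */
struct demo_inode {
	struct timespec atime;
	bool atime_stale;
};

static bool demo_ts_equal(const struct timespec *a, const struct timespec *b)
{
	return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/* Snapshot the backing atime before the read is issued. */
static void demo_start_read(const struct demo_inode *backing,
			    struct timespec *pre_atime)
{
	assert(backing && pre_atime);
	*pre_atime = backing->atime;
}

/* Mirror the backing fs atime policy, not the value itself. */
static void demo_end_read(struct demo_inode *fuse,
			  const struct demo_inode *backing,
			  const struct timespec *pre_atime)
{
	if (!demo_ts_equal(&backing->atime, pre_atime))
		fuse->atime_stale = true;
}
```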

> +
> + return *out;
> +}
> +
> +static int fuse_file_read_iter_finalize(struct bpf_fuse_args *fa, ssize_t *out,
> + struct kiocb *iocb, struct iov_iter *to)
> +{
> + struct fuse_read_iter_out *frio = fa->out_args[0].value;
> +
> + *out = frio->ret;
> +
> + return 0;
> +}
> +
> +int fuse_bpf_file_read_iter(ssize_t *out, struct inode *inode, struct kiocb *iocb, struct iov_iter *to)
> +{
> + return bpf_fuse_backing(inode, struct fuse_file_read_iter_args, out,
> + fuse_file_read_iter_initialize_in,
> + fuse_file_read_iter_initialize_out,
> + fuse_file_read_iter_backing,
> + fuse_file_read_iter_finalize,
> + iocb, to);
> +}
> +
> +struct fuse_file_write_iter_args {
> + struct fuse_write_in in;
> + struct fuse_write_iter_out out;
> +};
> +
> +static int fuse_file_write_iter_initialize_in(struct bpf_fuse_args *fa,
> + struct fuse_file_write_iter_args *args,
> + struct kiocb *iocb, struct iov_iter *from)
> +{
> + struct file *file = iocb->ki_filp;
> + struct fuse_file *ff = file->private_data;
> +
> + *args = (struct fuse_file_write_iter_args) {
> + .in.fh = ff->fh,
> + .in.offset = iocb->ki_pos,
> + .in.size = from->count,
> + };
> +
> + /* TODO we can't assume 'from' is a kvec */
> + *fa = (struct bpf_fuse_args) {
> + .info = (struct bpf_fuse_meta_info) {
> + .opcode = FUSE_WRITE,
> + .nodeid = ff->nodeid,
> + },
> + .in_numargs = 1,
> + .in_args[0].size = sizeof(args->in),
> + .in_args[0].value = &args->in,
> + };
> +
> + return 0;
> +}
> +
> +static int fuse_file_write_iter_initialize_out(struct bpf_fuse_args *fa,
> + struct fuse_file_write_iter_args *args,
> + struct kiocb *iocb, struct iov_iter *from)
> +{
> + /* TODO we can't assume 'from' is a kvec */
> + fa->out_numargs = 1;
> + fa->out_args[0].size = sizeof(args->out);
> + fa->out_args[0].value = &args->out;
> +
> + return 0;
> +}
> +
> +static int fuse_file_write_iter_backing(struct bpf_fuse_args *fa, ssize_t *out,
> + struct kiocb *iocb, struct iov_iter *from)
> +{
> + struct file *file = iocb->ki_filp;
> + struct fuse_file *ff = file->private_data;
> + struct fuse_write_iter_out *fwio = fa->out_args[0].value;
> + ssize_t count = iov_iter_count(from);
> +
> + if (!count)
> + return 0;
> +
> + /* TODO This just plain ignores any change to fuse_write_in */
> + /* TODO uint32_t seems smaller than ssize_t.... right? */
> + inode_lock(file_inode(file));
> +
> + fuse_copyattr(file, ff->backing_file);

fuse_copyattr() looks redundant and less subtle than what
fuse_file_end_write() already does.

Thanks,
Amir.

2023-05-17 03:12:53

by Gao Xiang

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE



On 2023/5/2 17:07, Daniel Rosenberg wrote:
> On Mon, Apr 24, 2023 at 8:32 AM Miklos Szeredi <[email protected]> wrote:
>>
>>
>> The security model needs to be thought about and documented. Think
>> about this: the fuse server now delegates operations it would itself
>> perform to the passthrough code in fuse. The permissions that would
>> have been checked in the context of the fuse server are now checked in
>> the context of the task performing the operation. The server may be
>> able to bypass seccomp restrictions. Files that are open on the
>> backing filesystem are now hidden (e.g. lsof won't find these), which
>> allows the server to obfuscate accesses to backing files. Etc.
>>
>> These are not particularly worrying if the server is privileged, but
>> fuse comes with the history of supporting unprivileged servers, so we
>> should look at supporting passthrough with unprivileged servers as
>> well.
>>
>
> This is on my todo list. My current plan is to grab the creds that the
> daemon uses to respond to FUSE_INIT. That should keep behavior fairly
> similar. I'm not sure if there are cases where the fuse server is
> operating under multiple contexts.
> I don't currently have a plan for exposing open files via lsof. Every
> such file should relate to one that will show up though. I haven't dug
> into how that's set up, but I'm open to suggestions.
>
>> My other generic comment is that you should add justification for
>> doing this in the first place. I guess it's mainly performance. So
>> how performance can be won in real life cases? It would also be good
>> to measure the contribution of individual ops to that win. Is there
>> another reason for this besides performance?
>>
>> Thanks,
>> Miklos
>
> Our main concern with it is performance. We have some preliminary
> numbers looking at the pure passthrough case. We've been testing using
> a ramdrive on a somewhat slow machine, as that should highlight
> differences more. We ran fio for sequential reads, and random
> read/write. For sequential reads, we were seeing libfuse's
> passthrough_hp take about a 50% hit, with fuse-bpf not being
> detectably slower. For random read/write, we were seeing a roughly 90%
> drop in performance from passthrough_hp, while fuse-bpf has about a 7%
> drop in read and write speed. When we use a bpf that traces every
> opcode, that performance hit increases to a roughly 1% drop in
> sequential read performance, and a 20% drop in both read and write
> performance for random read/write. We plan to make more complex bpf
> examples, with fuse daemon equivalents to compare against.
>
> We have not looked closely at the impact of individual opcodes yet.
>
> There's also a potential ease of use for fuse-bpf. If you're
> implementing a fuse daemon that is largely mirroring a backing
> filesystem, you only need to write code for the differences in
> behavior. For instance, say you want to remove image metadata like
> location. You could give bpf information on what range of data is
> metadata, and zero out that section without having to handle any other
> operations.

A bit off topic (although I haven't looked closely into the FUSE BPF
internals): after roughly listening to this topic in the FS track last
week, I'm not quite sure (at least in the long term) whether it might be
better if eBPF-related filter/redirect functionality could land in the vfs
or in a somewhat stackable fs, so that we could redirect/filter any
sub-fstree in principle. It's just an open question and I have no strong
preference here, but do we really need BPF-filter functionality for each
individual fs?

It sounds much like
https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/about-file-system-filter-drivers

Thanks,
Gao Xiang

>
> -Daniel

2023-05-17 07:05:08

by Amir Goldstein

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

On Wed, May 17, 2023 at 5:50 AM Gao Xiang <[email protected]> wrote:
>
>
>
> On 2023/5/2 17:07, Daniel Rosenberg wrote:
> > On Mon, Apr 24, 2023 at 8:32 AM Miklos Szeredi <[email protected]> wrote:
> >>
> >>
> >> The security model needs to be thought about and documented. Think
> >> about this: the fuse server now delegates operations it would itself
> >> perform to the passthrough code in fuse. The permissions that would
> >> have been checked in the context of the fuse server are now checked in
> >> the context of the task performing the operation. The server may be
> >> able to bypass seccomp restrictions. Files that are open on the
> >> backing filesystem are now hidden (e.g. lsof won't find these), which
> >> allows the server to obfuscate accesses to backing files. Etc.
> >>
> >> These are not particularly worrying if the server is privileged, but
> >> fuse comes with the history of supporting unprivileged servers, so we
> >> should look at supporting passthrough with unprivileged servers as
> >> well.
> >>
> >
> > This is on my todo list. My current plan is to grab the creds that the
> > daemon uses to respond to FUSE_INIT. That should keep behavior fairly
> > similar. I'm not sure if there are cases where the fuse server is
> > operating under multiple contexts.
> > I don't currently have a plan for exposing open files via lsof. Every
> > such file should relate to one that will show up though. I haven't dug
> > into how that's set up, but I'm open to suggestions.
> >
> >> My other generic comment is that you should add justification for
> >> doing this in the first place. I guess it's mainly performance. So
> >> how performance can be won in real life cases? It would also be good
> >> to measure the contribution of individual ops to that win. Is there
> >> another reason for this besides performance?
> >>
> >> Thanks,
> >> Miklos
> >
> > Our main concern with it is performance. We have some preliminary
> > numbers looking at the pure passthrough case. We've been testing using
> > a ramdrive on a somewhat slow machine, as that should highlight
> > differences more. We ran fio for sequential reads, and random
> > read/write. For sequential reads, we were seeing libfuse's
> > passthrough_hp take about a 50% hit, with fuse-bpf not being
> > detectably slower. For random read/write, we were seeing a roughly 90%
> > drop in performance from passthrough_hp, while fuse-bpf has about a 7%
> > drop in read and write speed. When we use a bpf that traces every
> > opcode, that performance hit increases to a roughly 1% drop in
> > sequential read performance, and a 20% drop in both read and write
> > performance for random read/write. We plan to make more complex bpf
> > examples, with fuse daemon equivalents to compare against.
> >
> > We have not looked closely at the impact of individual opcodes yet.
> >
> > There's also a potential ease of use for fuse-bpf. If you're
> > implementing a fuse daemon that is largely mirroring a backing
> > filesystem, you only need to write code for the differences in
> > behavior. For instance, say you want to remove image metadata like
> > location. You could give bpf information on what range of data is
> > metadata, and zero out that section without having to handle any other
> > operations.
>
> A bit out of topic (although I'm not quite look into FUSE BPF internals)
> After roughly listening to this topic in FS track last week, I'm not
> quite sure (at least in the long term) if it might be better if
> ebpf-related filter/redirect stuffs could be landed in vfs or in a
> somewhat stackable fs so that we could redirect/filter any sub-fstree
> in principle? It's just an open question and I have no real tendency
> of this but do we really need a BPF-filter functionality for each
> individual fs?

I think that is a valid question, but the answer is that even if it makes sense,
doing something like this in vfs would be a much bigger project with larger
consequences on performance and security and whatnot, so even if
(and a very big if) this ever happens, using FUSE-BPF as a playground for
this sort of stuff would be a good idea.

This reminds me of union mounts - it made sense to have union mount
functionality in vfs, but after a long winding road, a stacked fs (overlayfs)
turned out to be a much more practical solution.

>
> It sounds much like
> https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/about-file-system-filter-drivers
>

Nice reference.
I must admit that I found it hard to understand what Windows filter drivers
can do compared to FUSE-BPF design.
It'd be nice to get some comparison from what is planned for FUSE-BPF.

Interesting to note that there is a "legacy" Windows filter driver API,
so Windows didn't get everything right for the first API - that is especially
interesting to look at as repeating other people's mistakes would be a shame.

Thanks,
Amir.

2023-05-17 07:22:20

by Gao Xiang

Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE

Hi Amir,

On 2023/5/17 23:51, Amir Goldstein wrote:
> On Wed, May 17, 2023 at 5:50 AM Gao Xiang <[email protected]> wrote:
>>
>>
>>
>> On 2023/5/2 17:07, Daniel Rosenberg wrote:
>>> On Mon, Apr 24, 2023 at 8:32 AM Miklos Szeredi <[email protected]> wrote:
>>>>
>>>>
>>>> The security model needs to be thought about and documented. Think
>>>> about this: the fuse server now delegates operations it would itself
>>>> perform to the passthrough code in fuse. The permissions that would
>>>> have been checked in the context of the fuse server are now checked in
>>>> the context of the task performing the operation. The server may be
>>>> able to bypass seccomp restrictions. Files that are open on the
>>>> backing filesystem are now hidden (e.g. lsof won't find these), which
>>>> allows the server to obfuscate accesses to backing files. Etc.
>>>>
>>>> These are not particularly worrying if the server is privileged, but
>>>> fuse comes with the history of supporting unprivileged servers, so we
>>>> should look at supporting passthrough with unprivileged servers as
>>>> well.
>>>>
>>>
>>> This is on my todo list. My current plan is to grab the creds that the
>>> daemon uses to respond to FUSE_INIT. That should keep behavior fairly
>>> similar. I'm not sure if there are cases where the fuse server is
>>> operating under multiple contexts.
>>> I don't currently have a plan for exposing open files via lsof. Every
>>> such file should relate to one that will show up though. I haven't dug
>>> into how that's set up, but I'm open to suggestions.
>>>
>>>> My other generic comment is that you should add justification for
>>>> doing this in the first place. I guess it's mainly performance. So
>>>> how performance can be won in real life cases? It would also be good
>>>> to measure the contribution of individual ops to that win. Is there
>>>> another reason for this besides performance?
>>>>
>>>> Thanks,
>>>> Miklos
>>>
>>> Our main concern with it is performance. We have some preliminary
>>> numbers looking at the pure passthrough case. We've been testing using
>>> a ramdrive on a somewhat slow machine, as that should highlight
>>> differences more. We ran fio for sequential reads, and random
>>> read/write. For sequential reads, we were seeing libfuse's
>>> passthrough_hp take about a 50% hit, with fuse-bpf not being
>>> detectably slower. For random read/write, we were seeing a roughly 90%
>>> drop in performance from passthrough_hp, while fuse-bpf has about a 7%
>>> drop in read and write speed. When we use a bpf that traces every
>>> opcode, that performance hit increases to a roughly 1% drop in
>>> sequential read performance, and a 20% drop in both read and write
>>> performance for random read/write. We plan to make more complex bpf
>>> examples, with fuse daemon equivalents to compare against.
>>>
>>> We have not looked closely at the impact of individual opcodes yet.
>>>
>>> There's also a potential ease of use for fuse-bpf. If you're
>>> implementing a fuse daemon that is largely mirroring a backing
>>> filesystem, you only need to write code for the differences in
>>> behavior. For instance, say you want to remove image metadata like
>>> location. You could give bpf information on what range of data is
>>> metadata, and zero out that section without having to handle any other
>>> operations.
>>
>> A bit out of topic (although I'm not quite look into FUSE BPF internals)
>> After roughly listening to this topic in FS track last week, I'm not
>> quite sure (at least in the long term) if it might be better if
>> ebpf-related filter/redirect stuffs could be landed in vfs or in a
>> somewhat stackable fs so that we could redirect/filter any sub-fstree
>> in principle? It's just an open question and I have no real tendency
>> of this but do we really need a BPF-filter functionality for each
>> individual fs?
>
> I think that is a valid question, but the answer is that even if it makes sense,
> doing something like this in vfs would be a much bigger project with larger
> consequences on performance and security and whatnot, so even if
> (and a very big if) this ever happens, using FUSE-BPF as a playground for
> this sort of stuff would be a good idea.

My current observation is that the total FUSE-BPF LoC already exceeds that
of FUSE itself. In addition, it hooks almost all fs operations, which
concerns me.

>
> This reminds me of union mounts - it made sense to have union mount
> functionality in vfs, but after a long winding road, a stacked fs (overlayfs)
> turned out to be a much more practical solution.

Yeah, I agree. So it was just a pure hint on my side.

>
>>
>> It sounds much like
>> https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/about-file-system-filter-drivers
>>
>
> Nice reference.
> I must admit that I found it hard to understand what Windows filter drivers
> can do compared to FUSE-BPF design.
> It'd be nice to get some comparison from what is planned for FUSE-BPF.

Doing at least some investigation/analysis first might be better for
long-term development.

>
> Interesting to note that there is a "legacy" Windows filter driver API,
> so Windows didn't get everything right for the first API - that is especially
> interesting to look at as repeating other people's mistakes would be a shame.

I'm not familiar with those details either, but I saw that they have a
filesystem filter subsystem, so I mentioned it here.

Thanks,
Gao Xiang

>
> Thanks,
> Amir.

2023-05-17 07:44:00

by Gao Xiang

[permalink] [raw]
Subject: Re: [RFC PATCH bpf-next v3 00/37] FUSE BPF: A Stacked Filesystem Extension for FUSE



On 2023/5/17 00:05, Gao Xiang wrote:
> Hi Amir,
>
> On 2023/5/17 23:51, Amir Goldstein wrote:
>> On Wed, May 17, 2023 at 5:50 AM Gao Xiang <[email protected]> wrote:
>>>
>>>
>>>
>>> On 2023/5/2 17:07, Daniel Rosenberg wrote:
>>>> On Mon, Apr 24, 2023 at 8:32 AM Miklos Szeredi <[email protected]> wrote:
>>>>>
>>>>>
>>>>> The security model needs to be thought about and documented.  Think
>>>>> about this: the fuse server now delegates operations it would itself
>>>>> perform to the passthrough code in fuse.  The permissions that would
>>>>> have been checked in the context of the fuse server are now checked in
>>>>> the context of the task performing the operation.  The server may be
>>>>> able to bypass seccomp restrictions.  Files that are open on the
>>>>> backing filesystem are now hidden (e.g. lsof won't find these), which
>>>>> allows the server to obfuscate accesses to backing files.  Etc.
>>>>>
>>>>> These are not particularly worrying if the server is privileged, but
>>>>> fuse comes with the history of supporting unprivileged servers, so we
>>>>> should look at supporting passthrough with unprivileged servers as
>>>>> well.
>>>>>
>>>>
>>>> This is on my todo list. My current plan is to grab the creds that the
>>>> daemon uses to respond to FUSE_INIT. That should keep behavior fairly
>>>> similar. I'm not sure if there are cases where the fuse server is
>>>> operating under multiple contexts.
>>>> I don't currently have a plan for exposing open files via lsof. Every
>>>> such file should relate to one that will show up though. I haven't dug
>>>> into how that's set up, but I'm open to suggestions.
>>>>
>>>>> My other generic comment is that you should add justification for
>>>>> doing this in the first place.  I guess it's mainly performance.  So
>>>>> how can performance be won in real life cases?   It would also be good
>>>>> to measure the contribution of individual ops to that win.   Is there
>>>>> another reason for this besides performance?
>>>>>
>>>>> Thanks,
>>>>> Miklos
>>>>
>>>> Our main concern with it is performance. We have some preliminary
>>>> numbers looking at the pure passthrough case. We've been testing using
>>>> a ramdrive on a somewhat slow machine, as that should highlight
>>>> differences more. We ran fio for sequential reads, and random
>>>> read/write. For sequential reads, we were seeing libfuse's
>>>> passthrough_hp take about a 50% hit, with fuse-bpf not being
>>>> detectably slower. For random read/write, we were seeing a roughly 90%
>>>> drop in performance from passthrough_hp, while fuse-bpf has about a 7%
>>>> drop in read and write speed. When we use a bpf that traces every
>>>> opcode, that performance hit increases to a roughly 1% drop in
>>>> sequential read performance, and a 20% drop in both read and write
>>>> performance for random read/write. We plan to make more complex bpf
>>>> examples, with fuse daemon equivalents to compare against.
>>>>
>>>> We have not looked closely at the impact of individual opcodes yet.
>>>>
>>>> There's also a potential ease of use for fuse-bpf. If you're
>>>> implementing a fuse daemon that is largely mirroring a backing
>>>> filesystem, you only need to write code for the differences in
>>>> behavior. For instance, say you want to remove image metadata like
>>>> location. You could give bpf information on what range of data is
>>>> metadata, and zero out that section without having to handle any other
>>>> operations.
>>>
>>> A bit off topic (although I haven't looked closely at FUSE BPF internals):
>>> after roughly following this topic in the FS track last week, I'm not
>>> sure (at least in the long term) whether it might be better if
>>> eBPF-related filter/redirect functionality landed in vfs or in a
>>> generic stackable fs, so that we could redirect/filter any sub-fstree
>>> in principle. It's just an open question and I have no strong preference
>>> here, but do we really need BPF-filter functionality for each
>>> individual fs?
>>
>> I think that is a valid question, but the answer is that even if it makes sense,
>> doing something like this in vfs would be a much bigger project with larger
>> consequences on performance and security and whatnot, so even if
>> (and a very big if) this ever happens, using FUSE-BPF as a playground for
>> this sort of stuff would be a good idea.
>
> My current observation is that the total FUSE-BPF LoC already exceeds that of


^ Sorry, I double-checked and I was wrong; forget about it.

> FUSE itself.  In addition, it hooks almost all fs operations, which
> concerns me.
>
>>
>> This reminds me of union mounts - it made sense to have union mount
>> functionality in vfs, but after a long winding road, a stacked fs (overlayfs)
>> turned out to be a much more practical solution.
>
> Yeah, I agree.  So it was just a pure hint on my side.
>
>>
>>>
>>> It sounds much like
>>> https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/about-file-system-filter-drivers
>>>
>>
>> Nice reference.
>> I must admit that I found it hard to understand what Windows filter drivers
>> can do compared to FUSE-BPF design.
>> It'd be nice to get some comparison from what is planned for FUSE-BPF.
>
> Doing at least some investigation/analysis first might be better for
> long-term development.
>
>>
>> Interesting to note that there is a "legacy" Windows filter driver API,
>> so Windows didn't get everything right for the first API - that is especially
>> interesting to look at as repeating other people's mistakes would be a shame.
>
> I'm not familiar with those details either, but I saw that they have a
> filesystem filter subsystem, so I mentioned it here.
>
> Thanks,
> Gao Xiang
>
>>
>> Thanks,
>> Amir.