Hi,
so after the reviews from v3, and some discussion with Alexei, I am
back with a new version of HID-BPF.
This version is not complete (thus the RFC), but I'd like to share
it now to get initial feedback, in case I am too far from the actual
goal.
FTR, the goal is to provide some changes in the core verifier/btf so
that we can plug in HID-BPF independently from BPF core. This way we can
extend it without having to care about bpf-next.
The things I am not entirely sure about are:
- do we need only fentry/fexit/fmod_ret BPF program types or should
programs that modify the data stream use a different kind?
- patch 3/7 is probably not the correct approach (see comments in the
patch itself)
We are missing quite a few bits here:
- selftests for patches 1 to 4
- add the ability to attach a program to a struct device, and run that
program only for that struct device
- when running through bpf_prog_test_run_opts, how can we ensure we are
talking to the correct device? (I have a feeling this is linked to the
previous point)
- how can we reconnect the device when a report descriptor fixup BPF
program is loaded? (would it make sense to allow notifications for
when a BPF program is attached to/detached from a device, and which
functions have been traced?)
Cheers,
Benjamin
Benjamin Tissoires (7):
bpf/btf: also allow kfunc in tracing programs
bpf/verifier: allow kfunc to return an allocated mem
error-inject: add new type that carries if the function is non
sleepable
btf: Add a new kfunc set which allows to mark a function to be
sleepable
HID: initial BPF new way implementation
samples/bpf: add new hid_mouse example
selftests/bpf: add tests for the HID-bpf initial implementation
drivers/hid/hid-core.c | 115 +++++
include/asm-generic/error-injection.h | 1 +
include/linux/btf.h | 8 +
include/linux/hid_bpf.h | 29 ++
kernel/bpf/btf.c | 50 +-
kernel/bpf/verifier.c | 76 ++-
lib/error-inject.c | 2 +
samples/bpf/.gitignore | 1 +
samples/bpf/Makefile | 23 +
samples/bpf/hid_mouse.bpf.c | 59 +++
samples/bpf/hid_mouse.c | 131 +++++
tools/testing/selftests/bpf/config | 3 +
tools/testing/selftests/bpf/prog_tests/hid.c | 482 +++++++++++++++++++
tools/testing/selftests/bpf/progs/hid.c | 32 ++
14 files changed, 987 insertions(+), 25 deletions(-)
create mode 100644 include/linux/hid_bpf.h
create mode 100644 samples/bpf/hid_mouse.bpf.c
create mode 100644 samples/bpf/hid_mouse.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/hid.c
create mode 100644 tools/testing/selftests/bpf/progs/hid.c
--
2.35.1
When a kfunc returns a pointer to a plain type instead of a pointer to
a struct, check whether one of the arguments carries the __sz suffix
and is a constant from the caller, and use that value as the size of
the allocated memory.
For tracing programs, we consider the provided memory to be read-only
unless the program is BPF_MODIFY_RETURN.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
new in v4
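For reference, the naming convention this patch relies on can be modeled as a suffix check on the BTF parameter name. The following is a simplified userspace sketch, not the kernel code; the helper name is illustrative:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Simplified model of the check btf_is_kfunc_arg_mem_size() performs:
 * a kfunc argument whose name ends in "__sz" is treated as the size
 * of the memory returned by the kfunc.
 */
static bool arg_name_is_mem_size(const char *param_name)
{
	size_t len = strlen(param_name);
	size_t sfx_len = sizeof("__sz") - 1;

	if (len < sfx_len)
		return false;
	return strcmp(param_name + len - sfx_len, "__sz") == 0;
}
```

So a kfunc declared with a trailing `const size_t __sz` argument (as `hid_bpf_kfunc_get_data()` is in this series) exposes that argument as the size of the returned buffer, provided the caller passes a constant.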
---
include/linux/btf.h | 6 ++++
kernel/bpf/btf.c | 31 ++++++++++++++++----
kernel/bpf/verifier.c | 66 +++++++++++++++++++++++++++++++++----------
3 files changed, 83 insertions(+), 20 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 36bc09b8e890..76a3ff48ae2a 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -332,6 +332,12 @@ static inline struct btf_param *btf_params(const struct btf_type *t)
return (struct btf_param *)(t + 1);
}
+struct bpf_reg_state;
+
+bool btf_is_kfunc_arg_mem_size(const struct btf *btf,
+ const struct btf_param *arg,
+ const struct bpf_reg_state *reg);
+
#ifdef CONFIG_BPF_SYSCALL
struct bpf_prog;
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 76318a4c2d0e..22e6e3cdc7ee 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5851,9 +5851,9 @@ static bool __btf_type_is_scalar_struct(struct bpf_verifier_log *log,
return true;
}
-static bool is_kfunc_arg_mem_size(const struct btf *btf,
- const struct btf_param *arg,
- const struct bpf_reg_state *reg)
+bool btf_is_kfunc_arg_mem_size(const struct btf *btf,
+ const struct btf_param *arg,
+ const struct bpf_reg_state *reg)
{
int len, sfx_len = sizeof("__sz") - 1;
const struct btf_type *t;
@@ -5976,7 +5976,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
reg_btf = reg->btf;
reg_ref_id = reg->btf_id;
/* Ensure only one argument is referenced
- * PTR_TO_BTF_ID, check_func_arg_reg_off relies
+ * PTR_TO_BTF_ID or PTR_TO_MEM, check_func_arg_reg_off relies
* on only one referenced register being allowed
* for kfuncs.
*/
@@ -6012,7 +6012,10 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
u32 type_size;
if (is_kfunc) {
- bool arg_mem_size = i + 1 < nargs && is_kfunc_arg_mem_size(btf, &args[i + 1], &regs[regno + 1]);
+ bool arg_mem_size = i + 1 < nargs &&
+ btf_is_kfunc_arg_mem_size(btf,
+ &args[i + 1],
+ &regs[regno + 1]);
/* Permit pointer to mem, but only when argument
* type is pointer to scalar, or struct composed
@@ -6039,6 +6042,24 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
i++;
continue;
}
+
+ if (rel && reg->ref_obj_id) {
+ /* Ensure only one argument is referenced
+ * PTR_TO_BTF_ID or PTR_TO_MEM, check_func_arg_reg_off
+ * relies on only one referenced register being allowed
+ * for kfuncs.
+ */
+ if (ref_obj_id) {
+ bpf_log(log,
+ "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
+ regno,
+ reg->ref_obj_id,
+ ref_obj_id);
+ return -EFAULT;
+ }
+ ref_regno = regno;
+ ref_obj_id = reg->ref_obj_id;
+ }
}
resolve_ret = btf_resolve_size(btf, ref_t, &type_size);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 71827d14724a..0f339f9058f3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6974,7 +6974,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
int err, insn_idx = *insn_idx_p;
const struct btf_param *args;
struct btf *desc_btf;
+ enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
bool acq;
+ size_t reg_size = 0;
/* skip for now, but return error when we find this in fixup_kfunc_call */
if (!insn->imm)
@@ -7015,8 +7017,8 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
}
- for (i = 0; i < CALLER_SAVED_REGS; i++)
- mark_reg_not_init(env, regs, caller_saved[i]);
+ /* reset REG_0 */
+ mark_reg_not_init(env, regs, BPF_REG_0);
/* Check return type */
t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
@@ -7026,6 +7028,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
return -EINVAL;
}
+ nargs = btf_type_vlen(func_proto);
+ args = btf_params(func_proto);
+
if (btf_type_is_scalar(t)) {
mark_reg_unknown(env, regs, BPF_REG_0);
mark_btf_func_reg_size(env, BPF_REG_0, t->size);
@@ -7033,24 +7038,54 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
ptr_type = btf_type_skip_modifiers(desc_btf, t->type,
&ptr_type_id);
if (!btf_type_is_struct(ptr_type)) {
- ptr_type_name = btf_name_by_offset(desc_btf,
- ptr_type->name_off);
- verbose(env, "kernel function %s returns pointer type %s %s is not supported\n",
- func_name, btf_type_str(ptr_type),
- ptr_type_name);
- return -EINVAL;
+ /* if we have an array, we must have a const argument whose name ends in "__sz" */
+ for (i = 0; i < nargs; i++) {
+ u32 regno = i + BPF_REG_1;
+ struct bpf_reg_state *reg = &regs[regno];
+
+ /* look for any const scalar parameter whose name ends in "__sz" */
+ if (!check_reg_arg(env, regno, SRC_OP) &&
+ tnum_is_const(regs[regno].var_off) &&
+ btf_is_kfunc_arg_mem_size(desc_btf, &args[i], reg))
+ reg_size = regs[regno].var_off.value;
+ }
+
+ if (!reg_size) {
+ ptr_type_name = btf_name_by_offset(desc_btf,
+ ptr_type->name_off);
+ verbose(env,
+ "kernel function %s returns pointer type %s %s is not supported\n",
+ func_name,
+ btf_type_str(ptr_type),
+ ptr_type_name);
+ return -EINVAL;
+ }
+
+ mark_reg_known_zero(env, regs, BPF_REG_0);
+ regs[BPF_REG_0].type = PTR_TO_MEM;
+ regs[BPF_REG_0].mem_size = reg_size;
+
+ /* in case of tracing, only allow write access to
+ * BPF_MODIFY_RETURN programs
+ */
+ if (prog_type == BPF_PROG_TYPE_TRACING &&
+ env->prog->expected_attach_type != BPF_MODIFY_RETURN)
+ regs[BPF_REG_0].type |= MEM_RDONLY;
+ } else {
+ mark_reg_known_zero(env, regs, BPF_REG_0);
+ regs[BPF_REG_0].type = PTR_TO_BTF_ID;
+ regs[BPF_REG_0].btf = desc_btf;
+ regs[BPF_REG_0].btf_id = ptr_type_id;
+ mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *));
}
- mark_reg_known_zero(env, regs, BPF_REG_0);
- regs[BPF_REG_0].btf = desc_btf;
- regs[BPF_REG_0].type = PTR_TO_BTF_ID;
- regs[BPF_REG_0].btf_id = ptr_type_id;
+
if (btf_kfunc_id_set_contains(desc_btf, resolve_prog_type(env->prog),
BTF_KFUNC_TYPE_RET_NULL, func_id)) {
regs[BPF_REG_0].type |= PTR_MAYBE_NULL;
/* For mark_ptr_or_null_reg, see 93c230e3f5bd6 */
regs[BPF_REG_0].id = ++env->id_gen;
}
- mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *));
+
if (acq) {
int id = acquire_reference_state(env, insn_idx);
@@ -7061,8 +7096,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
} /* else { add_kfunc_call() ensures it is btf_type_is_void(t) } */
- nargs = btf_type_vlen(func_proto);
- args = (const struct btf_param *)(func_proto + 1);
+ for (i = 1 ; i < CALLER_SAVED_REGS; i++)
+ mark_reg_not_init(env, regs, caller_saved[i]);
+
for (i = 0; i < nargs; i++) {
u32 regno = i + 1;
--
2.35.1
Everything should be available in the selftests part of the tree, but
providing an example that does not rely on uhid and hidraw will be
easier for users to follow.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
changes in v4:
- dropped the not-yet-implemented rdesc_fixup
- use the new API
changes in v3:
- use the new hid_get_data API
- add a comment for the report descriptor fixup to explain what is done
changes in v2:
- split the series by bpf/libbpf/hid/selftests and samples
---
samples/bpf/.gitignore | 1 +
samples/bpf/Makefile | 23 +++++++
samples/bpf/hid_mouse.bpf.c | 59 ++++++++++++++++
samples/bpf/hid_mouse.c | 131 ++++++++++++++++++++++++++++++++++++
4 files changed, 214 insertions(+)
create mode 100644 samples/bpf/hid_mouse.bpf.c
create mode 100644 samples/bpf/hid_mouse.c
diff --git a/samples/bpf/.gitignore b/samples/bpf/.gitignore
index 0e7bfdbff80a..65440bd618b2 100644
--- a/samples/bpf/.gitignore
+++ b/samples/bpf/.gitignore
@@ -2,6 +2,7 @@
cpustat
fds_example
hbm
+hid_mouse
ibumad
lathist
lwt_len_hist
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 8fff5ad3444b..dffca4b04a94 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -60,6 +60,8 @@ tprogs-y += xdp_redirect_map
tprogs-y += xdp_redirect
tprogs-y += xdp_monitor
+tprogs-y += hid_mouse
+
# Libbpf dependencies
LIBBPF_SRC = $(TOOLS_PATH)/lib/bpf
LIBBPF_OUTPUT = $(abspath $(BPF_SAMPLES_PATH))/libbpf
@@ -125,6 +127,8 @@ xdp_redirect-objs := xdp_redirect_user.o $(XDP_SAMPLE)
xdp_monitor-objs := xdp_monitor_user.o $(XDP_SAMPLE)
xdp_router_ipv4-objs := xdp_router_ipv4_user.o $(XDP_SAMPLE)
+hid_mouse-objs := hid_mouse.o
+
# Tell kbuild to always build the programs
always-y := $(tprogs-y)
always-y += sockex1_kern.o
@@ -349,6 +353,8 @@ $(obj)/hbm_out_kern.o: $(src)/hbm.h $(src)/hbm_kern.h
$(obj)/hbm.o: $(src)/hbm.h
$(obj)/hbm_edt_kern.o: $(src)/hbm.h $(src)/hbm_kern.h
+$(obj)/hid_mouse.o: $(obj)/hid_mouse.skel.h
+
# Override includes for xdp_sample_user.o because $(srctree)/usr/include in
# TPROGS_CFLAGS causes conflicts
XDP_SAMPLE_CFLAGS += -Wall -O2 \
@@ -434,6 +440,23 @@ $(BPF_SKELS_LINKED): $(BPF_OBJS_LINKED) $(BPFTOOL)
@echo " BPF GEN-SKEL" $(@:.skel.h=)
$(Q)$(BPFTOOL) gen skeleton $(@:.skel.h=.lbpf.o) name $(notdir $(@:.skel.h=)) > $@
+# Generate BPF skeletons for non XDP progs
+OTHER_BPF_SKELS := hid_mouse.skel.h
+
+hid_mouse.skel.h-deps := hid_mouse.bpf.o
+
+OTHER_BPF_SRCS_LINKED := $(patsubst %.skel.h,%.bpf.c, $(OTHER_BPF_SKELS))
+OTHER_BPF_OBJS_LINKED := $(patsubst %.bpf.c,$(obj)/%.bpf.o, $(OTHER_BPF_SRCS_LINKED))
+OTHER_BPF_SKELS_LINKED := $(addprefix $(obj)/,$(OTHER_BPF_SKELS))
+
+$(OTHER_BPF_SKELS_LINKED): $(OTHER_BPF_OBJS_LINKED) $(BPFTOOL)
+ @echo " BPF GEN-OBJ " $(@:.skel.h=)
+ $(Q)$(BPFTOOL) gen object $(@:.skel.h=.lbpf.o) $(addprefix $(obj)/,$($(@F)-deps))
+ @echo " BPF GEN-SKEL" $(@:.skel.h=)
+ $(Q)$(BPFTOOL) gen skeleton $(@:.skel.h=.lbpf.o) name $(notdir $(@:.skel.h=_lskel)) > $@
+# $(call msg,GEN-SKEL,$@)
+# $(Q)$(BPFTOOL) gen skeleton $< > $@
+
# asm/sysreg.h - inline assembly used by it is incompatible with llvm.
# But, there is no easy way to fix it, so just exclude it since it is
# useless for BPF samples.
diff --git a/samples/bpf/hid_mouse.bpf.c b/samples/bpf/hid_mouse.bpf.c
new file mode 100644
index 000000000000..45cc26b70f5f
--- /dev/null
+++ b/samples/bpf/hid_mouse.bpf.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Note: for the following code to compile, we need HID to be included
+ * in vmlinux (CONFIG_HID=y).
+ * If HID is compiled as a separate module, we need to use the vmlinux.h
+ * which contains the various hid symbols, it can be generated through:
+ *
+ * $> ./tools/bpf/bpftool/bpftool btf dump \
+ * file /sys/kernel/btf/hid format c > samples/bpf/vmlinux.h
+ *
+ * Once the code is compiled, the fact that HID is a separate module
+ * or not is not an issue; the same binary will run similarly.
+ */
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+extern __u8 *hid_bpf_kfunc_get_data(struct hid_bpf_ctx *ctx,
+ unsigned int offset,
+ const size_t __sz) __ksym;
+extern void hid_bpf_kfunc_data_release(__u8 *data) __ksym;
+extern int hid_bpf_kfunc_hw_request(struct hid_bpf_ctx *ctx) __ksym;
+
+#define BUS_USB 3
+
+SEC("fentry/hid_bpf_device_event")
+int BPF_KPROBE(hid_event, struct hid_bpf_ctx *hctx, __s64 type)
+{
+ __u8 *rw_data = hid_bpf_kfunc_get_data(hctx, 0, 5);
+
+ if (!rw_data)
+ return 0;
+
+ if (hctx->bus == BUS_USB) {
+ /* note: the following call prevents the program from being loaded:
+ * hid_bpf_device_event() is not sleepable when this function is.
+ */
+ hid_bpf_kfunc_hw_request(hctx);
+
+ bpf_printk("data: %02x %02x %02x", rw_data[0], rw_data[1], rw_data[4]);
+ }
+
+ hid_bpf_kfunc_data_release(rw_data);
+
+ return 0;
+}
+
+SEC("fmod_ret/hid_bpf_device_event")
+int BPF_PROG(hid_mod_event, struct hid_bpf_ctx *hctx, __s64 type)
+{
+ /* prevent any USB event from going through the HID stack */
+ if (hctx->bus == BUS_USB)
+ return -1;
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/samples/bpf/hid_mouse.c b/samples/bpf/hid_mouse.c
new file mode 100644
index 000000000000..aace0b959d3b
--- /dev/null
+++ b/samples/bpf/hid_mouse.c
@@ -0,0 +1,131 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2022 Benjamin Tissoires
+ */
+
+/* not sure why but this doesn't get properly imported */
+#define __must_check
+
+#include <assert.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <libgen.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/resource.h>
+#include <unistd.h>
+
+#include <linux/bpf.h>
+#include <linux/err.h>
+
+#include "bpf_util.h"
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+#include "hid_mouse.skel.h"
+
+static char *sysfs_path;
+static int sysfs_fd;
+static int prog_count;
+
+static bool running = true;
+
+struct prog {
+ int fd;
+ struct bpf_link *link;
+ enum bpf_attach_type type;
+};
+
+static struct prog progs[10];
+
+static void int_exit(int sig)
+{
+ running = false;
+ exit(0);
+}
+
+static void usage(const char *prog)
+{
+ fprintf(stderr,
+ "%s: %s /sys/bus/hid/devices/0BUS:0VID:0PID:00ID/uevent\n\n",
+ __func__, prog);
+}
+
+int main(int argc, char **argv)
+{
+ struct hid_mouse_lskel *skel;
+ int prog_fd, err;
+ const char *optstr = "";
+ int opt;
+ char filename[256];
+
+ LIBBPF_OPTS(bpf_test_run_opts, topts,
+ .repeat = 1,
+ );
+
+ while ((opt = getopt(argc, argv, optstr)) != -1) {
+ switch (opt) {
+ default:
+ usage(basename(argv[0]));
+ return 1;
+ }
+ }
+
+ if (optind == argc) {
+ usage(basename(argv[0]));
+ return 1;
+ }
+
+ sysfs_path = argv[optind];
+ if (!sysfs_path) {
+ perror("sysfs");
+ return 1;
+ }
+
+ skel = hid_mouse_lskel__open_and_load();
+ if (!skel) {
+ fprintf(stderr, "%s %s:%d", __func__, __FILE__, __LINE__);
+ return -1;
+ }
+
+ err = hid_mouse_lskel__attach(skel);
+ if (err)
+ goto cleanup;
+
+ //prog_fd = bpf_program__fd(skel->progs.hid_event);
+ //err = bpf_prog_test_run_opts(prog_fd, &topts);
+
+// sysfs_fd = open(sysfs_path, O_RDONLY);
+//
+//// bpf_object__for_each_program(prog, obj) {
+//// progs[prog_count].fd = bpf_program__fd(prog);
+//// progs[prog_count].type = bpf_program__get_expected_attach_type(prog);
+//// progs[prog_count].link = bpf_program__attach(prog);
+//// if (libbpf_get_error(progs[prog_count].link)) {
+//// fprintf(stderr, "bpf_prog_attach: err=%m\n");
+//// progs[prog_count].fd = 0;
+//// progs[prog_count].link = NULL;
+//// goto cleanup;
+//// }
+//// prog_count++;
+//// }
+//
+ signal(SIGINT, int_exit);
+ signal(SIGTERM, int_exit);
+//
+//// err = bpf_obj_get_info_by_fd(progs[0].fd, &info, &info_len);
+//// if (err) {
+//// printf("can't get prog info - %s\n", strerror(errno));
+//// goto cleanup;
+//// }
+//
+ while (running)
+ ;
+
+ cleanup:
+ hid_mouse_lskel__destroy(skel);
+
+ return err;
+}
--
2.35.1
When using an error-injection function through BPF to change the return
code, we need to know if the function is sleepable or not.
Currently the code assumes that all error-injection functions are sleepable,
except for a selected few hardcoded in kernel/bpf/verifier.c.
Add a new flag to error-injection so we can encode that information where
the function is declared.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
new in v4:
- another approach would be to define a new kfunc_set, and register
it with btf. But in that case, what program type would we use?
BPF_PROG_TYPE_UNSPEC?
- also note that maybe we should consider all of the functions
non-sleepable and only mark some as sleepable. IMO it makes more
sense to be more restrictive by default.
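To make the resulting attach logic concrete, here is a simplified userspace model of the check in bpf_check_attach_target() after this patch (names and types are illustrative, not the kernel code):

```c
#include <stdbool.h>

/* Mirrors include/asm-generic/error-injection.h after this patch;
 * only the new NS_ERRNO type matters here.
 */
enum ei_etype { EI_ETYPE_ERRNO, EI_ETYPE_NS_ERRNO };

/*
 * A sleepable fentry/fexit/fmod_ret program may attach only if the
 * target is within the error-injection list, is not in the verifier's
 * hardcoded denylist, and is not declared with the new
 * EI_ETYPE_NS_ERRNO type at its declaration site.
 */
static bool sleepable_attach_ok(bool within_error_injection_list,
				bool in_verifier_denylist,
				enum ei_etype etype)
{
	return within_error_injection_list &&
	       !in_verifier_denylist &&
	       etype != EI_ETYPE_NS_ERRNO;
}
```

The point of the patch is the third condition: sleepability now follows the declaration (via ALLOW_ERROR_INJECTION(func, NS_ERRNO)) instead of only a centralized denylist.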
---
include/asm-generic/error-injection.h | 1 +
kernel/bpf/verifier.c | 10 ++++++++--
lib/error-inject.c | 2 ++
3 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/asm-generic/error-injection.h b/include/asm-generic/error-injection.h
index fbca56bd9cbc..5974942353a6 100644
--- a/include/asm-generic/error-injection.h
+++ b/include/asm-generic/error-injection.h
@@ -9,6 +9,7 @@ enum {
EI_ETYPE_ERRNO, /* Return -ERRNO if failure */
EI_ETYPE_ERRNO_NULL, /* Return -ERRNO or NULL if failure */
EI_ETYPE_TRUE, /* Return true if failure */
+ EI_ETYPE_NS_ERRNO, /* Return -ERRNO if failure and tag the function as non-sleepable */
};
struct error_injection_entry {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0f339f9058f3..45c8feea6478 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14085,6 +14085,11 @@ static int check_non_sleepable_error_inject(u32 btf_id)
return btf_id_set_contains(&btf_non_sleepable_error_inject, btf_id);
}
+static int is_non_sleepable_error_inject(unsigned long addr)
+{
+ return get_injectable_error_type(addr) == EI_ETYPE_NS_ERRNO;
+}
+
int bpf_check_attach_target(struct bpf_verifier_log *log,
const struct bpf_prog *prog,
const struct bpf_prog *tgt_prog,
@@ -14281,8 +14286,9 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
/* fentry/fexit/fmod_ret progs can be sleepable only if they are
* attached to ALLOW_ERROR_INJECTION and are not in denylist.
*/
- if (!check_non_sleepable_error_inject(btf_id) &&
- within_error_injection_list(addr))
+ if (within_error_injection_list(addr) &&
+ !check_non_sleepable_error_inject(btf_id) &&
+ !is_non_sleepable_error_inject(addr))
ret = 0;
break;
case BPF_PROG_TYPE_LSM:
diff --git a/lib/error-inject.c b/lib/error-inject.c
index 2ff5ef689d72..560c3b18f439 100644
--- a/lib/error-inject.c
+++ b/lib/error-inject.c
@@ -183,6 +183,8 @@ static const char *error_type_string(int etype)
return "ERRNO_NULL";
case EI_ETYPE_TRUE:
return "TRUE";
+ case EI_ETYPE_NS_ERRNO:
+ return "NS_ERRNO";
default:
return "(unknown)";
}
--
2.35.1
Using fentry/fexit and fmod_ret is a very convenient way for modules
to access BPF capabilities, so being able to define kfuncs for those
program types is a must.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
new in v4:
- I think this is where I need to add my new kfuncs, though
in the end I need to be able to change the incoming data, so
maybe only fmod_ret is the one we need to be able to be RW.
---
kernel/bpf/btf.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 0493310d981f..76318a4c2d0e 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -202,6 +202,7 @@ enum btf_kfunc_hook {
BTF_KFUNC_HOOK_XDP,
BTF_KFUNC_HOOK_TC,
BTF_KFUNC_HOOK_STRUCT_OPS,
+ BTF_KFUNC_HOOK_TRACING,
BTF_KFUNC_HOOK_MAX,
};
@@ -6892,6 +6893,8 @@ static int bpf_prog_type_to_kfunc_hook(enum bpf_prog_type prog_type)
return BTF_KFUNC_HOOK_TC;
case BPF_PROG_TYPE_STRUCT_OPS:
return BTF_KFUNC_HOOK_STRUCT_OPS;
+ case BPF_PROG_TYPE_TRACING:
+ return BTF_KFUNC_HOOK_TRACING;
default:
return BTF_KFUNC_HOOK_MAX;
}
--
2.35.1
Declare an entry point that can use fmod_ret BPF programs, and
also an API to access and change the incoming data.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
new-ish in v4:
- far from complete, but gives an overview of what we can do now.
---
drivers/hid/hid-core.c | 115 ++++++++++++++++++++++++++++++++++++++++
include/linux/hid_bpf.h | 29 ++++++++++
2 files changed, 144 insertions(+)
create mode 100644 include/linux/hid_bpf.h
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index db925794fbe6..ff4e1b47d2fb 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -32,6 +32,9 @@
#include <linux/hiddev.h>
#include <linux/hid-debug.h>
#include <linux/hidraw.h>
+#include <linux/btf.h>
+#include <linux/btf_ids.h>
+#include <linux/hid_bpf.h>
#include "hid-ids.h"
@@ -2008,6 +2011,99 @@ int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size,
}
EXPORT_SYMBOL_GPL(hid_report_raw_event);
+struct hid_bpf_ctx_kern {
+ struct hid_device *hdev;
+ struct hid_bpf_ctx ctx;
+ u8 *data;
+ size_t size;
+};
+
+__weak int hid_bpf_device_event(struct hid_bpf_ctx *ctx, s64 type)
+{
+ return 0;
+}
+ALLOW_ERROR_INJECTION(hid_bpf_device_event, NS_ERRNO);
+
+noinline __u8 *
+hid_bpf_kfunc_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t __sz)
+{
+ struct hid_bpf_ctx_kern *ctx_kern;
+
+ if (!ctx)
+ return NULL;
+
+ ctx_kern = container_of(ctx, struct hid_bpf_ctx_kern, ctx);
+
+ return ctx_kern->data;
+}
+
+noinline void
+hid_bpf_kfunc_data_release(__u8 *data)
+{
+}
+
+noinline int
+hid_bpf_kfunc_hw_request(struct hid_bpf_ctx *ctx)
+{
+ if (!ctx)
+ return -EINVAL;
+
+ pr_err("%s test ctx->bus: %04x %s:%d", __func__, ctx->bus, __FILE__, __LINE__);
+
+ return 0;
+}
+
+/*
+ * The following set contains all functions we agree BPF programs
+ * can use.
+ */
+BTF_SET_START(hid_bpf_kfunc_ids)
+BTF_ID(func, hid_bpf_kfunc_get_data)
+BTF_ID(func, hid_bpf_kfunc_data_release)
+BTF_ID(func, hid_bpf_kfunc_hw_request)
+BTF_SET_END(hid_bpf_kfunc_ids)
+
+/*
+ * The following set contains all functions to provide a kernel
+ * resource to the BPF program.
+ * We need to add a counterpart release function.
+ */
+BTF_SET_START(hid_bpf_kfunc_acquire_ids)
+BTF_ID(func, hid_bpf_kfunc_get_data)
+BTF_SET_END(hid_bpf_kfunc_acquire_ids)
+
+/*
+ * The following set is the release counterpart of the previous
+ * function set.
+ */
+BTF_SET_START(hid_bpf_kfunc_release_ids)
+BTF_ID(func, hid_bpf_kfunc_data_release)
+BTF_SET_END(hid_bpf_kfunc_release_ids)
+
+/*
+ * The following set tells which functions are sleepable.
+ */
+BTF_SET_START(hid_bpf_kfunc_sleepable_ids)
+BTF_ID(func, hid_bpf_kfunc_hw_request)
+BTF_SET_END(hid_bpf_kfunc_sleepable_ids)
+
+static const struct btf_kfunc_id_set hid_bpf_kfunc_set = {
+ .owner = THIS_MODULE,
+ .check_set = &hid_bpf_kfunc_ids,
+ .acquire_set = &hid_bpf_kfunc_acquire_ids,
+ .release_set = &hid_bpf_kfunc_release_ids,
+ .ret_null_set = &hid_bpf_kfunc_acquire_ids, /* duplicated to force BPF programs to
+ * check the validity of the returned pointer
+ * in acquire function
+ */
+ .sleepable_set = &hid_bpf_kfunc_sleepable_ids,
+};
+
+static int __init hid_bpf_init(void)
+{
+ return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &hid_bpf_kfunc_set);
+}
+
/**
* hid_input_report - report data from lower layer (usb, bt...)
*
@@ -2025,6 +2121,17 @@ int hid_input_report(struct hid_device *hid, int type, u8 *data, u32 size, int i
struct hid_driver *hdrv;
struct hid_report *report;
int ret = 0;
+ struct hid_bpf_ctx_kern ctx_kern = {
+ .hdev = hid,
+ .ctx = {
+ .bus = hid->bus,
+ .group = hid->group,
+ .vendor = hid->vendor,
+ .product = hid->product,
+ },
+ .data = data,
+ .size = size,
+ };
if (!hid)
return -ENODEV;
@@ -2039,6 +2146,10 @@ int hid_input_report(struct hid_device *hid, int type, u8 *data, u32 size, int i
report_enum = hid->report_enum + type;
hdrv = hid->driver;
+ ret = hid_bpf_device_event(&ctx_kern.ctx, type);
+ if (ret)
+ goto unlock;
+
if (!size) {
dbg_hid("empty report\n");
ret = -1;
@@ -2914,6 +3025,10 @@ static int __init hid_init(void)
hid_debug_init();
+ ret = hid_bpf_init();
+ if (ret)
+ pr_err("%s error in bpf_init: %d %s:%d", __func__, ret, __FILE__, __LINE__);
+
return 0;
err_bus:
bus_unregister(&hid_bus_type);
diff --git a/include/linux/hid_bpf.h b/include/linux/hid_bpf.h
new file mode 100644
index 000000000000..100909b27cd8
--- /dev/null
+++ b/include/linux/hid_bpf.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+
+#ifndef __HID_BPF_H
+#define __HID_BPF_H
+
+/*
+ * The following is the HID BPF API.
+ *
+ * It should be treated as UAPI, so extra care is required
+ * when making change to this file.
+ */
+
+struct hid_bpf_ctx {
+ __u16 bus; /* BUS ID */
+ __u16 group; /* Report group */
+ __u32 vendor; /* Vendor ID */
+ __u32 product; /* Product ID */
+ __u32 version; /* HID version */
+};
+
+/* Following functions are tracepoints that BPF programs can attach to */
+int hid_bpf_device_event(struct hid_bpf_ctx *ctx, __s64 type);
+
+/* Following functions are kfunc that we export to BPF programs */
+__u8 *hid_bpf_kfunc_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t __sz);
+void hid_bpf_kfunc_data_release(__u8 *data);
+int hid_bpf_kfunc_hw_request(struct hid_bpf_ctx *ctx);
+
+#endif /* __HID_BPF_H */
--
2.35.1
The test is pretty basic:
- create a virtual uhid device that no userspace will like (so as not
to mess up the running system)
- attach a BPF prog to it
- open the matching hidraw node
- inject one event and check:
* that the BPF program can act on the event stream
* that it can modify the event stream
Signed-off-by: Benjamin Tissoires <[email protected]>
---
changes in v4:
- manually retrieve the hidraw node from the sysfs (we can't get it for
free from BPF)
- use the new API
changes in v3:
- squashed "hid: rely on uhid event to know if a test device is ready"
into this one
- add selftests bpf VM config changes
- s/hidraw_ino/hidraw_number/
changes in v2:
- split the series by bpf/libbpf/hid/selftests and samples
---
tools/testing/selftests/bpf/config | 3 +
tools/testing/selftests/bpf/prog_tests/hid.c | 482 +++++++++++++++++++
tools/testing/selftests/bpf/progs/hid.c | 32 ++
3 files changed, 517 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/hid.c
create mode 100644 tools/testing/selftests/bpf/progs/hid.c
diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config
index 763db63a3890..b983c76535f8 100644
--- a/tools/testing/selftests/bpf/config
+++ b/tools/testing/selftests/bpf/config
@@ -53,3 +53,6 @@ CONFIG_NF_DEFRAG_IPV4=y
CONFIG_NF_DEFRAG_IPV6=y
CONFIG_NF_CONNTRACK=y
CONFIG_USERFAULTFD=y
+CONFIG_HID=y
+CONFIG_HIDRAW=y
+CONFIG_UHID=y
diff --git a/tools/testing/selftests/bpf/prog_tests/hid.c b/tools/testing/selftests/bpf/prog_tests/hid.c
new file mode 100644
index 000000000000..624853fb0f7d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/hid.c
@@ -0,0 +1,482 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Red Hat */
+#include <test_progs.h>
+#include <testing_helpers.h>
+#include "hid.skel.h"
+
+#include <fcntl.h>
+#include <fnmatch.h>
+#include <dirent.h>
+#include <poll.h>
+#include <stdbool.h>
+#include <linux/uhid.h>
+
+static unsigned char rdesc[] = {
+ 0x06, 0x00, 0xff, /* Usage Page (Vendor Defined Page 1) */
+ 0x09, 0x21, /* Usage (Vendor Usage 0x21) */
+ 0xa1, 0x01, /* COLLECTION (Application) */
+ 0x09, 0x01, /* Usage (Vendor Usage 0x01) */
+ 0xa1, 0x00, /* COLLECTION (Physical) */
+ 0x85, 0x01, /* REPORT_ID (1) */
+ 0x06, 0x00, 0xff, /* Usage Page (Vendor Defined Page 1) */
+ 0x19, 0x01, /* USAGE_MINIMUM (1) */
+ 0x29, 0x03, /* USAGE_MAXIMUM (3) */
+ 0x15, 0x00, /* LOGICAL_MINIMUM (0) */
+ 0x25, 0x01, /* LOGICAL_MAXIMUM (1) */
+ 0x95, 0x03, /* REPORT_COUNT (3) */
+ 0x75, 0x01, /* REPORT_SIZE (1) */
+ 0x81, 0x02, /* INPUT (Data,Var,Abs) */
+ 0x95, 0x01, /* REPORT_COUNT (1) */
+ 0x75, 0x05, /* REPORT_SIZE (5) */
+ 0x81, 0x01, /* INPUT (Cnst,Var,Abs) */
+ 0x05, 0x01, /* USAGE_PAGE (Generic Desktop) */
+ 0x09, 0x30, /* USAGE (X) */
+ 0x09, 0x31, /* USAGE (Y) */
+ 0x15, 0x81, /* LOGICAL_MINIMUM (-127) */
+ 0x25, 0x7f, /* LOGICAL_MAXIMUM (127) */
+ 0x75, 0x10, /* REPORT_SIZE (16) */
+ 0x95, 0x02, /* REPORT_COUNT (2) */
+ 0x81, 0x06, /* INPUT (Data,Var,Rel) */
+
+ 0x06, 0x00, 0xff, /* Usage Page (Vendor Defined Page 1) */
+ 0x19, 0x01, /* USAGE_MINIMUM (1) */
+ 0x29, 0x03, /* USAGE_MAXIMUM (3) */
+ 0x15, 0x00, /* LOGICAL_MINIMUM (0) */
+ 0x25, 0x01, /* LOGICAL_MAXIMUM (1) */
+ 0x95, 0x03, /* REPORT_COUNT (3) */
+ 0x75, 0x01, /* REPORT_SIZE (1) */
+ 0x91, 0x02, /* Output (Data,Var,Abs) */
+ 0x95, 0x01, /* REPORT_COUNT (1) */
+ 0x75, 0x05, /* REPORT_SIZE (5) */
+ 0x91, 0x01, /* Output (Cnst,Var,Abs) */
+
+ 0x06, 0x00, 0xff, /* Usage Page (Vendor Defined Page 1) */
+ 0x19, 0x06, /* USAGE_MINIMUM (6) */
+ 0x29, 0x08, /* USAGE_MAXIMUM (8) */
+ 0x15, 0x00, /* LOGICAL_MINIMUM (0) */
+ 0x25, 0x01, /* LOGICAL_MAXIMUM (1) */
+ 0x95, 0x03, /* REPORT_COUNT (3) */
+ 0x75, 0x01, /* REPORT_SIZE (1) */
+ 0xb1, 0x02, /* Feature (Data,Var,Abs) */
+ 0x95, 0x01, /* REPORT_COUNT (1) */
+ 0x75, 0x05, /* REPORT_SIZE (5) */
+ 0x91, 0x01, /* Output (Cnst,Var,Abs) */
+
+ 0xc0, /* END_COLLECTION */
+ 0xc0, /* END_COLLECTION */
+};
+
+static pthread_mutex_t uhid_started_mtx = PTHREAD_MUTEX_INITIALIZER;
+static pthread_cond_t uhid_started = PTHREAD_COND_INITIALIZER;
+
+/* no need to protect uhid_stopped, only one thread accesses it */
+static bool uhid_stopped;
+
+static int uhid_write(int fd, const struct uhid_event *ev)
+{
+ ssize_t ret;
+
+ ret = write(fd, ev, sizeof(*ev));
+ if (ret < 0) {
+ fprintf(stderr, "Cannot write to uhid: %m\n");
+ return -errno;
+ } else if (ret != sizeof(*ev)) {
+ fprintf(stderr, "Wrong size written to uhid: %zd != %zu\n",
+ ret, sizeof(*ev));
+ return -EFAULT;
+ } else {
+ return 0;
+ }
+}
+
+static int create(int fd, int rand_nb)
+{
+ struct uhid_event ev;
+ char buf[25];
+
+ sprintf(buf, "test-uhid-device-%d", rand_nb);
+
+ memset(&ev, 0, sizeof(ev));
+ ev.type = UHID_CREATE;
+ strcpy((char *)ev.u.create.name, buf);
+ ev.u.create.rd_data = rdesc;
+ ev.u.create.rd_size = sizeof(rdesc);
+ ev.u.create.bus = BUS_USB;
+ ev.u.create.vendor = 0x0001;
+ ev.u.create.product = 0x0a37;
+ ev.u.create.version = 0;
+ ev.u.create.country = 0;
+
+ sprintf(buf, "%d", rand_nb);
+ strcpy((char *)ev.u.create.phys, buf);
+
+ return uhid_write(fd, &ev);
+}
+
+static void destroy(int fd)
+{
+ struct uhid_event ev;
+
+ memset(&ev, 0, sizeof(ev));
+ ev.type = UHID_DESTROY;
+
+ uhid_write(fd, &ev);
+}
+
+static int event(int fd)
+{
+ struct uhid_event ev;
+ ssize_t ret;
+
+ memset(&ev, 0, sizeof(ev));
+ ret = read(fd, &ev, sizeof(ev));
+ if (ret == 0) {
+ fprintf(stderr, "Read HUP on uhid-cdev\n");
+ return -EFAULT;
+ } else if (ret < 0) {
+ fprintf(stderr, "Cannot read uhid-cdev: %m\n");
+ return -errno;
+ } else if (ret != sizeof(ev)) {
+ fprintf(stderr, "Invalid size read from uhid-dev: %zd != %zu\n",
+ ret, sizeof(ev));
+ return -EFAULT;
+ }
+
+ switch (ev.type) {
+ case UHID_START:
+ pthread_mutex_lock(&uhid_started_mtx);
+ pthread_cond_signal(&uhid_started);
+ pthread_mutex_unlock(&uhid_started_mtx);
+
+ fprintf(stderr, "UHID_START from uhid-dev\n");
+ break;
+ case UHID_STOP:
+ uhid_stopped = true;
+
+ fprintf(stderr, "UHID_STOP from uhid-dev\n");
+ break;
+ case UHID_OPEN:
+ fprintf(stderr, "UHID_OPEN from uhid-dev\n");
+ break;
+ case UHID_CLOSE:
+ fprintf(stderr, "UHID_CLOSE from uhid-dev\n");
+ break;
+ case UHID_OUTPUT:
+ fprintf(stderr, "UHID_OUTPUT from uhid-dev\n");
+ break;
+ case UHID_GET_REPORT:
+ fprintf(stderr, "UHID_GET_REPORT from uhid-dev\n");
+ break;
+ case UHID_SET_REPORT:
+ fprintf(stderr, "UHID_SET_REPORT from uhid-dev\n");
+ break;
+ default:
+ fprintf(stderr, "Invalid event from uhid-dev: %u\n", ev.type);
+ }
+
+ return 0;
+}
+
+static void *read_uhid_events_thread(void *arg)
+{
+ int fd = *(int *)arg;
+ struct pollfd pfds[1];
+ int ret = 0;
+
+ pfds[0].fd = fd;
+ pfds[0].events = POLLIN;
+
+ uhid_stopped = false;
+
+ while (!uhid_stopped) {
+ ret = poll(pfds, 1, 100);
+ if (ret < 0) {
+ fprintf(stderr, "Cannot poll for fds: %m\n");
+ break;
+ }
+ if (pfds[0].revents & POLLIN) {
+ ret = event(fd);
+ if (ret)
+ break;
+ }
+ }
+
+ return (void *)(long)ret;
+}
+
+static int uhid_start_listener(pthread_t *tid, int uhid_fd)
+{
+ int fd = uhid_fd;
+
+ pthread_mutex_lock(&uhid_started_mtx);
+ if (CHECK_FAIL(pthread_create(tid, NULL, read_uhid_events_thread,
+ (void *)&fd))) {
+ pthread_mutex_unlock(&uhid_started_mtx);
+ close(fd);
+ return -EIO;
+ }
+ pthread_cond_wait(&uhid_started, &uhid_started_mtx);
+ pthread_mutex_unlock(&uhid_started_mtx);
+
+ return 0;
+}
+
+static int send_event(int fd, u8 *buf, size_t size)
+{
+ struct uhid_event ev;
+
+ if (size > sizeof(ev.u.input2.data))
+ return -E2BIG;
+
+ memset(&ev, 0, sizeof(ev));
+ ev.type = UHID_INPUT2;
+ ev.u.input2.size = size;
+
+ memcpy(ev.u.input2.data, buf, size);
+
+ return uhid_write(fd, &ev);
+}
+
+static int setup_uhid(int rand_nb)
+{
+ int fd;
+ const char *path = "/dev/uhid";
+ int ret;
+
+ fd = open(path, O_RDWR | O_CLOEXEC);
+ if (!ASSERT_GE(fd, 0, "open uhid-cdev"))
+ return -EPERM;
+
+ ret = create(fd, rand_nb);
+ if (!ASSERT_OK(ret, "create uhid device")) {
+ close(fd);
+ return -EPERM;
+ }
+
+ return fd;
+}
+
+static bool match_sysfs_device(int dev_id, const char *workdir, struct dirent *dir)
+{
+ const char *target = "0003:0001:0A37.*";
+ char phys[512];
+ char uevent[1024];
+ char temp[512];
+ int fd, nread;
+ bool found = false;
+
+ if (fnmatch(target, dir->d_name, 0))
+ return false;
+
+ /* we found the correct VID/PID, now check for phys */
+ sprintf(uevent, "%s/%s/uevent", workdir, dir->d_name);
+
+ fd = open(uevent, O_RDONLY | O_NONBLOCK);
+ if (fd < 0)
+ return false;
+
+ sprintf(phys, "PHYS=%d", dev_id);
+
+ nread = read(fd, temp, ARRAY_SIZE(temp) - 1);
+ if (nread > 0) {
+ temp[nread] = '\0';
+ if (strstr(temp, phys) != NULL)
+ found = true;
+ }
+
+ close(fd);
+
+ return found;
+}
+
+static int get_sysfs_fd(int dev_id)
+{
+ const char *workdir = "/sys/devices/virtual/misc/uhid";
+ char uevent[1024];
+ DIR *d;
+ struct dirent *dir;
+ int found = -1;
+
+ /* it would be nice to be able to use nftw, but the no_alu32 target doesn't support it */
+
+ d = opendir(workdir);
+ if (d) {
+ while ((dir = readdir(d)) != NULL) {
+ if (!match_sysfs_device(dev_id, workdir, dir))
+ continue;
+
+ sprintf(uevent, "%s/%s/uevent", workdir, dir->d_name);
+
+ found = open(uevent, O_RDONLY | O_NONBLOCK);
+ if (found >= 0)
+ break;
+ }
+ closedir(d);
+ }
+
+ return found;
+}
+
+static int get_hidraw(int dev_id)
+{
+ const char *workdir = "/sys/devices/virtual/misc/uhid";
+ char sysfs[1024];
+ DIR *d, *subd;
+ struct dirent *dir, *subdir;
+ int i, found = -1;
+
+ /* retry 5 times in case the system is loaded */
+ for (i = 5; i > 0; i--) {
+ usleep(10);
+ d = opendir(workdir);
+
+ if (!d)
+ continue;
+
+ while ((dir = readdir(d)) != NULL) {
+ if (!match_sysfs_device(dev_id, workdir, dir))
+ continue;
+
+ sprintf(sysfs, "%s/%s/hidraw", workdir, dir->d_name);
+
+ subd = opendir(sysfs);
+ if (!subd)
+ continue;
+
+ while ((subdir = readdir(subd)) != NULL) {
+ if (fnmatch("hidraw*", subdir->d_name, 0))
+ continue;
+
+ found = atoi(subdir->d_name + strlen("hidraw"));
+ }
+
+ closedir(subd);
+
+ if (found >= 0)
+ break;
+ }
+ closedir(d);
+
+ if (found >= 0)
+ break;
+ }
+
+ return found;
+}
+
+/*
+ * Attach hid_first_event to the given uhid device,
+ * retrieve and open the matching hidraw node,
+ * inject one event in the uhid device,
+ * check that the program sees it and can change the data
+ */
+static int test_hid_raw_event(struct hid *hid_skel, int uhid_fd, int sysfs_fd, int dev_id)
+{
+ int err, hidraw_number, hidraw_fd = -1;
+ char hidraw_path[64] = {0};
+ u8 buf[10] = {0};
+ int ret = -1;
+
+ /* check that the program is correctly loaded */
+ ASSERT_EQ(hid_skel->data->callback_check, 52, "callback_check1");
+ ASSERT_EQ(hid_skel->data->callback2_check, 52, "callback2_check1");
+
+ /* attach the first program */
+ hid_skel->links.hid_first_event =
+ bpf_program__attach(hid_skel->progs.hid_first_event);
+ if (!ASSERT_OK_PTR(hid_skel->links.hid_first_event,
+ "attach_hid(hid_first_event)"))
+ return PTR_ERR(hid_skel->links.hid_first_event);
+
+ hidraw_number = get_hidraw(dev_id);
+ if (!ASSERT_GE(hidraw_number, 0, "get_hidraw"))
+ goto cleanup;
+
+ /* open hidraw node to check the other side of the pipe */
+ sprintf(hidraw_path, "/dev/hidraw%d", hidraw_number);
+ hidraw_fd = open(hidraw_path, O_RDWR | O_NONBLOCK);
+
+ if (!ASSERT_GE(hidraw_fd, 0, "open_hidraw"))
+ goto cleanup;
+
+ /* inject one event */
+ buf[0] = 1;
+ buf[1] = 42;
+ err = send_event(uhid_fd, buf, 6);
+ if (!ASSERT_OK(err, "send_event"))
+ goto cleanup;
+
+ /* check that hid_first_event() was executed */
+ ASSERT_EQ(hid_skel->data->callback_check, 42, "callback_check1");
+
+ /* read the data from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(hidraw_fd, buf, sizeof(buf));
+ if (!ASSERT_EQ(err, 6, "read_hidraw"))
+ goto cleanup;
+
+ if (!ASSERT_EQ(buf[2], 47, "hid_first_event"))
+ goto cleanup;
+
+ /* inject another event */
+ buf[0] = 1;
+ buf[1] = 47;
+ err = send_event(uhid_fd, buf, 6);
+ if (!ASSERT_OK(err, "send_event"))
+ goto cleanup;
+
+ /* check that hid_first_event() was executed */
+ ASSERT_EQ(hid_skel->data->callback_check, 47, "callback_check1");
+
+ /* read the data from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(hidraw_fd, buf, sizeof(buf));
+ if (!ASSERT_EQ(err, 6, "read_hidraw"))
+ goto cleanup;
+
+ if (!ASSERT_EQ(buf[2], 52, "hid_first_event"))
+ goto cleanup;
+
+ ret = 0;
+
+cleanup:
+ if (hidraw_fd >= 0)
+ close(hidraw_fd);
+
+ hid__detach(hid_skel);
+
+ return ret;
+}
+
+void serial_test_hid_bpf(void)
+{
+ struct hid *hid_skel = NULL;
+ int err, uhid_fd, sysfs_fd;
+ void *uhid_err;
+ time_t t;
+ pthread_t tid;
+ int dev_id;
+
+ /* initialize random number generator */
+ srand((unsigned int)time(&t));
+
+ dev_id = rand() % 1024;
+
+ uhid_fd = setup_uhid(dev_id);
+ if (!ASSERT_GE(uhid_fd, 0, "setup uhid"))
+ return;
+
+ err = uhid_start_listener(&tid, uhid_fd);
+ ASSERT_OK(err, "uhid_start_listener");
+
+ /* locate the uevent file of the created device */
+ sysfs_fd = get_sysfs_fd(dev_id);
+ if (!ASSERT_GE(sysfs_fd, 0, "locate sysfs uhid device"))
+ goto cleanup;
+
+ hid_skel = hid__open_and_load();
+ if (!ASSERT_OK_PTR(hid_skel, "hid_skel_load"))
+ goto cleanup;
+
+ /* start the tests! */
+ err = test_hid_raw_event(hid_skel, uhid_fd, sysfs_fd, dev_id);
+ ASSERT_OK(err, "hid");
+
+cleanup:
+ hid__destroy(hid_skel);
+ destroy(uhid_fd);
+
+ pthread_join(tid, &uhid_err);
+ err = (int)(long)uhid_err;
+ CHECK_FAIL(err);
+}
diff --git a/tools/testing/selftests/bpf/progs/hid.c b/tools/testing/selftests/bpf/progs/hid.c
new file mode 100644
index 000000000000..37cc98ca206b
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/hid.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Red Hat */
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+extern __u8 *hid_bpf_kfunc_get_data(struct hid_bpf_ctx *ctx,
+ unsigned int offset,
+ const size_t __sz) __ksym;
+extern void hid_bpf_kfunc_data_release(__u8 *data) __ksym;
+
+__u64 callback_check = 52;
+__u64 callback2_check = 52;
+
+SEC("fmod_ret/hid_bpf_device_event")
+int BPF_PROG(hid_first_event, struct hid_bpf_ctx *hid_ctx, __s64 type)
+{
+ __u8 *rw_data = hid_bpf_kfunc_get_data(hid_ctx, 0 /* offset */, 3 /* size */);
+
+ if (!rw_data)
+ return 0; /* EPERM check */
+
+ callback_check = rw_data[1];
+
+ rw_data[2] = rw_data[1] + 5;
+
+ hid_bpf_kfunc_data_release(rw_data);
+
+ return 0;
+}
--
2.35.1
On Thu, Apr 21, 2022 at 04:07:35PM +0200, Benjamin Tissoires wrote:
> When a kfunc is not returning a pointer to a struct but to a plain type,
> check if one of the arguments is called __sz and is a const from the
> caller, and use this as the size of the allocated memory.
>
> For tracing programs, we consider the provided memory to be read only
> unless the program is BPF_MODIFY_RETURN.
>
> Signed-off-by: Benjamin Tissoires <[email protected]>
>
> ---
>
> new in v4
> ---
> include/linux/btf.h | 6 ++++
> kernel/bpf/btf.c | 31 ++++++++++++++++----
> kernel/bpf/verifier.c | 66 +++++++++++++++++++++++++++++++++----------
> 3 files changed, 83 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/btf.h b/include/linux/btf.h
> index 36bc09b8e890..76a3ff48ae2a 100644
> --- a/include/linux/btf.h
> +++ b/include/linux/btf.h
> @@ -332,6 +332,12 @@ static inline struct btf_param *btf_params(const struct btf_type *t)
> return (struct btf_param *)(t + 1);
> }
>
> +struct bpf_reg_state;
> +
> +bool btf_is_kfunc_arg_mem_size(const struct btf *btf,
> + const struct btf_param *arg,
> + const struct bpf_reg_state *reg);
> +
> #ifdef CONFIG_BPF_SYSCALL
> struct bpf_prog;
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 76318a4c2d0e..22e6e3cdc7ee 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -5851,9 +5851,9 @@ static bool __btf_type_is_scalar_struct(struct bpf_verifier_log *log,
> return true;
> }
>
> -static bool is_kfunc_arg_mem_size(const struct btf *btf,
> - const struct btf_param *arg,
> - const struct bpf_reg_state *reg)
> +bool btf_is_kfunc_arg_mem_size(const struct btf *btf,
> + const struct btf_param *arg,
> + const struct bpf_reg_state *reg)
> {
> int len, sfx_len = sizeof("__sz") - 1;
> const struct btf_type *t;
> @@ -5976,7 +5976,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> reg_btf = reg->btf;
> reg_ref_id = reg->btf_id;
> /* Ensure only one argument is referenced
> - * PTR_TO_BTF_ID, check_func_arg_reg_off relies
> + * PTR_TO_BTF_ID or PTR_TO_MEM, check_func_arg_reg_off relies
> * on only one referenced register being allowed
> * for kfuncs.
> */
> @@ -6012,7 +6012,10 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> u32 type_size;
>
> if (is_kfunc) {
> - bool arg_mem_size = i + 1 < nargs && is_kfunc_arg_mem_size(btf, &args[i + 1], &regs[regno + 1]);
> + bool arg_mem_size = i + 1 < nargs &&
> + btf_is_kfunc_arg_mem_size(btf,
> + &args[i + 1],
> + &regs[regno + 1]);
bpf allows ~100 chars. No need to break the line so much.
>
> /* Permit pointer to mem, but only when argument
> * type is pointer to scalar, or struct composed
> @@ -6039,6 +6042,24 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> i++;
> continue;
> }
> +
> + if (rel && reg->ref_obj_id) {
> + /* Ensure only one argument is referenced
> + * PTR_TO_BTF_ID or PTR_TO_MEM, check_func_arg_reg_off
> + * relies on only one referenced register being allowed
> + * for kfuncs.
> + */
> + if (ref_obj_id) {
> + bpf_log(log,
> + "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
> + regno,
> + reg->ref_obj_id,
> + ref_obj_id);
> + return -EFAULT;
> + }
> + ref_regno = regno;
> + ref_obj_id = reg->ref_obj_id;
> + }
> }
>
> resolve_ret = btf_resolve_size(btf, ref_t, &type_size);
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 71827d14724a..0f339f9058f3 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -6974,7 +6974,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> int err, insn_idx = *insn_idx_p;
> const struct btf_param *args;
> struct btf *desc_btf;
> + enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
> bool acq;
> + size_t reg_size = 0;
>
> /* skip for now, but return error when we find this in fixup_kfunc_call */
> if (!insn->imm)
> @@ -7015,8 +7017,8 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> }
> }
>
> - for (i = 0; i < CALLER_SAVED_REGS; i++)
> - mark_reg_not_init(env, regs, caller_saved[i]);
> + /* reset REG_0 */
> + mark_reg_not_init(env, regs, BPF_REG_0);
>
> /* Check return type */
> t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
> @@ -7026,6 +7028,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> return -EINVAL;
> }
>
> + nargs = btf_type_vlen(func_proto);
> + args = btf_params(func_proto);
> +
> if (btf_type_is_scalar(t)) {
> mark_reg_unknown(env, regs, BPF_REG_0);
> mark_btf_func_reg_size(env, BPF_REG_0, t->size);
> @@ -7033,24 +7038,54 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> ptr_type = btf_type_skip_modifiers(desc_btf, t->type,
> &ptr_type_id);
> if (!btf_type_is_struct(ptr_type)) {
> - ptr_type_name = btf_name_by_offset(desc_btf,
> - ptr_type->name_off);
> - verbose(env, "kernel function %s returns pointer type %s %s is not supported\n",
> - func_name, btf_type_str(ptr_type),
> - ptr_type_name);
> - return -EINVAL;
> + /* if we have an array, we must have a const argument named "__sz" */
> + for (i = 0; i < nargs; i++) {
> + u32 regno = i + BPF_REG_1;
> + struct bpf_reg_state *reg = &regs[regno];
> +
> + /* look for any const scalar parameter of name "__sz" */
> + if (!check_reg_arg(env, regno, SRC_OP) &&
> + tnum_is_const(regs[regno].var_off) &&
> + btf_is_kfunc_arg_mem_size(desc_btf, &args[i], reg))
> + reg_size = regs[regno].var_off.value;
> + }
> +
> + if (!reg_size) {
> + ptr_type_name = btf_name_by_offset(desc_btf,
> + ptr_type->name_off);
> + verbose(env,
> + "kernel function %s returns pointer type %s %s is not supported\n",
> + func_name,
> + btf_type_str(ptr_type),
> + ptr_type_name);
> + return -EINVAL;
> + }
> +
> + mark_reg_known_zero(env, regs, BPF_REG_0);
> + regs[BPF_REG_0].type = PTR_TO_MEM;
> + regs[BPF_REG_0].mem_size = reg_size;
> +
> + /* in case of tracing, only allow write access to
> + * BPF_MODIFY_RETURN programs
> + */
> + if (prog_type == BPF_PROG_TYPE_TRACING &&
> + env->prog->expected_attach_type != BPF_MODIFY_RETURN)
> + regs[BPF_REG_0].type |= MEM_RDONLY;
MOD_RET restriction looks artificial.
We can distinguish readonly vs writeable PTR_TO_MEM based on
another naming convention.
Currently arg_name__sz applies to the previous argument.
Matching suffix made sense there.
Reusing the same suffix matching for a different purpose could be confusing.
For this use case we may reserve a full argument name.
Like "rdonly_buf_size" and "rdwr_buf_size" ?
On Thu, Apr 21, 2022 at 04:07:38PM +0200, Benjamin Tissoires wrote:
> Declare an entry point that can use fmod_ret BPF programs, and
> also an API to access and change the incoming data.
>
> Signed-off-by: Benjamin Tissoires <[email protected]>
>
> ---
>
> new-ish in v4:
> - far from complete, but gives an overview of what we can do now.
> ---
> drivers/hid/hid-core.c | 115 ++++++++++++++++++++++++++++++++++++++++
> include/linux/hid_bpf.h | 29 ++++++++++
> 2 files changed, 144 insertions(+)
> create mode 100644 include/linux/hid_bpf.h
>
> diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> index db925794fbe6..ff4e1b47d2fb 100644
> --- a/drivers/hid/hid-core.c
> +++ b/drivers/hid/hid-core.c
> @@ -32,6 +32,9 @@
> #include <linux/hiddev.h>
> #include <linux/hid-debug.h>
> #include <linux/hidraw.h>
> +#include <linux/btf.h>
> +#include <linux/btf_ids.h>
> +#include <linux/hid_bpf.h>
>
> #include "hid-ids.h"
>
> @@ -2008,6 +2011,99 @@ int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size,
> }
> EXPORT_SYMBOL_GPL(hid_report_raw_event);
>
> +struct hid_bpf_ctx_kern {
> + struct hid_device *hdev;
> + struct hid_bpf_ctx ctx;
> + u8 *data;
> + size_t size;
> +};
> +
> +__weak int hid_bpf_device_event(struct hid_bpf_ctx *ctx, s64 type)
> +{
> + return 0;
> +}
> +ALLOW_ERROR_INJECTION(hid_bpf_device_event, NS_ERRNO);
> +
> +noinline __u8 *
> +hid_bpf_kfunc_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t __sz)
> +{
> + struct hid_bpf_ctx_kern *ctx_kern;
> +
> + if (!ctx)
> + return NULL;
> +
> + ctx_kern = container_of(ctx, struct hid_bpf_ctx_kern, ctx);
> +
> + return ctx_kern->data;
> +}
It probably needs to check that "rdonly_buf_sz" or "__sz" constant passed
by the program is less than what was allocated in hid_bpf_ctx_kern->data.
Right?
> +
> +noinline void
> +hid_bpf_kfunc_data_release(__u8 *data)
> +{
> +}
Not clear what release() is for.
hid_bpf_kfunc_get_data() will return PTR_TO_MEM.
There is no need to release it.
> +
> +noinline int
> +hid_bpf_kfunc_hw_request(struct hid_bpf_ctx *ctx)
> +{
> + if (!ctx)
> + return -EINVAL;
> +
> + pr_err("%s test ctx->bus: %04x %s:%d", __func__, ctx->bus, __FILE__, __LINE__);
> +
> + return 0;
> +}
> +
> +/*
> + * The following set contains all functions we agree BPF programs
> + * can use.
> + */
> +BTF_SET_START(hid_bpf_kfunc_ids)
> +BTF_ID(func, hid_bpf_kfunc_get_data)
> +BTF_ID(func, hid_bpf_kfunc_data_release)
> +BTF_ID(func, hid_bpf_kfunc_hw_request)
> +BTF_SET_END(hid_bpf_kfunc_ids)
> +
> +/*
> + * The following set contains all functions to provide a kernel
> + * resource to the BPF program.
> + * We need to add a counterpart release function.
> + */
> +BTF_SET_START(hid_bpf_kfunc_acquire_ids)
> +BTF_ID(func, hid_bpf_kfunc_get_data)
> +BTF_SET_END(hid_bpf_kfunc_acquire_ids)
> +
> +/*
> + * The following set is the release counterpart of the previous
> + * function set.
> + */
> +BTF_SET_START(hid_bpf_kfunc_release_ids)
> +BTF_ID(func, hid_bpf_kfunc_data_release)
> +BTF_SET_END(hid_bpf_kfunc_release_ids)
> +
> +/*
> + * The following set tells which functions are sleepable.
> + */
> +BTF_SET_START(hid_bpf_kfunc_sleepable_ids)
> +BTF_ID(func, hid_bpf_kfunc_hw_request)
> +BTF_SET_END(hid_bpf_kfunc_sleepable_ids)
> +
> +static const struct btf_kfunc_id_set hid_bpf_kfunc_set = {
> + .owner = THIS_MODULE,
> + .check_set = &hid_bpf_kfunc_ids,
> + .acquire_set = &hid_bpf_kfunc_acquire_ids,
> + .release_set = &hid_bpf_kfunc_release_ids,
> + .ret_null_set = &hid_bpf_kfunc_acquire_ids, /* duplicated to force BPF programs to
> + * check the validity of the returned pointer
> + * in acquire function
> + */
> + .sleepable_set = &hid_bpf_kfunc_sleepable_ids,
> +};
> +
> +static int __init hid_bpf_init(void)
> +{
> + return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &hid_bpf_kfunc_set);
> +}
> +
> /**
> * hid_input_report - report data from lower layer (usb, bt...)
> *
> @@ -2025,6 +2121,17 @@ int hid_input_report(struct hid_device *hid, int type, u8 *data, u32 size, int i
> struct hid_driver *hdrv;
> struct hid_report *report;
> int ret = 0;
> + struct hid_bpf_ctx_kern ctx_kern = {
> + .hdev = hid,
> + .ctx = {
> + .bus = hid->bus,
> + .group = hid->group,
> + .vendor = hid->vendor,
> + .product = hid->product,
> + },
> + .data = data,
> + .size = size,
> + };
I'm not sure why hid_bpf_ctx_kern is needed.
Just to scope all args into one struct?
> +/*
> + * The following is the HID BPF API.
> + *
> + * It should be treated as UAPI, so extra care is required
> + * when making change to this file.
> + */
> +
> +struct hid_bpf_ctx {
> + __u16 bus; /* BUS ID */
> + __u16 group; /* Report group */
> + __u32 vendor; /* Vendor ID */
> + __u32 product; /* Product ID */
> + __u32 version; /* HID version */
> +};
Your choice of course,
but not clear why kernel has to populate it with explicit:
+ .bus = hid->bus,
+ .group = hid->group,
+ .vendor = hid->vendor,
+ .product = hid->product,
and waste cpu cycles on that while bpf program can read all these fields
on demand when necessary.
On Thu, Apr 21, 2022 at 04:07:33PM +0200, Benjamin Tissoires wrote:
> Hi,
>
> so after the reviews from v3, and some discussion with Alexei, I am
> back with a new version of HID-BPF.
>
> This version is not complete (thus the RFC), but I'd like to share
> it now to get initial feedback, in case I am too far from the actual
> goal.
>
> FTR, the goal is to provide some changes in the core verifier/btf so
> that we can plug in HID-BPF independently from BPF core. This way we can
> extend it without having to care about bpf-next.
Overall looks great. imo much cleaner, simpler and more extensible
than the earlier versions.
The bpf core extensions are nicely contained and HID side can be
worked on in parallel.
> The things I am not entirely sure are:
> - do we need only fentry/fexit/fmod_ret BPF program types or should
> programs that modify the data stream use a different kind?
Probably not. I'll reply in patch 2.
> - patch 3/7 is probably not the correct approach (see comments in the
> patch itself)
>
> We are missing quite a few bits here:
> - selftests for patches 1 to 4
> - add the ability to attach a program to a struct device, and run that
> program only for that struct device
yes. That is still to be figured out.
> - when running through bpf_prog_test_run_opts, how can we ensure we are
> talking to the correct device? (I have a feeling this is linked to the
> previous point)
> - how can we reconnect the device when a report descriptor fixup BPF
> program is loaded (would it make sense to allow some notifications on
> when a BPF program is attached/detached to a device, and which
> function have been traced?)
Not sure I follow. What kind of notification do you have in mind?
To user space?
On Thu, Apr 21, 2022 at 04:07:36PM +0200, Benjamin Tissoires wrote:
> When using error-injection function through bpf to change the return
> code, we need to know if the function is sleepable or not.
>
> Currently the code assumes that all error-inject functions are sleepable,
> except for a few selected of them, hardcoded in kernel/bpf/verifier.c
>
> Add a new flag to error-inject so we can code that information where the
> function is declared.
>
> Signed-off-by: Benjamin Tissoires <[email protected]>
>
> ---
>
> new in v4:
> - another approach would be to define a new kfunc_set, and register
> it with btf. But in that case, what program type would we use?
> BPF_PROG_TYPE_UNSPEC?
> - also note that maybe we should consider all of the functions
> non-sleepable and only mark some as sleepable. IMO it makes more
> sense to be more restrictive by default.
I think the approach in this patch is fine.
We didn't have issues with check_non_sleepable_error_inject() so far,
so I wouldn't start refactoring it.
> ---
> include/asm-generic/error-injection.h | 1 +
> kernel/bpf/verifier.c | 10 ++++++++--
> lib/error-inject.c | 2 ++
> 3 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/include/asm-generic/error-injection.h b/include/asm-generic/error-injection.h
> index fbca56bd9cbc..5974942353a6 100644
> --- a/include/asm-generic/error-injection.h
> +++ b/include/asm-generic/error-injection.h
> @@ -9,6 +9,7 @@ enum {
> EI_ETYPE_ERRNO, /* Return -ERRNO if failure */
> EI_ETYPE_ERRNO_NULL, /* Return -ERRNO or NULL if failure */
> EI_ETYPE_TRUE, /* Return true if failure */
> + EI_ETYPE_NS_ERRNO, /* Return -ERRNO if failure and tag the function as non-sleepable */
> };
>
> struct error_injection_entry {
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 0f339f9058f3..45c8feea6478 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -14085,6 +14085,11 @@ static int check_non_sleepable_error_inject(u32 btf_id)
> return btf_id_set_contains(&btf_non_sleepable_error_inject, btf_id);
> }
>
> +static int is_non_sleepable_error_inject(unsigned long addr)
> +{
> + return get_injectable_error_type(addr) == EI_ETYPE_NS_ERRNO;
It's a linear search. Probably ok. But would be good to double check
that we're not calling it a lot.
> +}
> +
> int bpf_check_attach_target(struct bpf_verifier_log *log,
> const struct bpf_prog *prog,
> const struct bpf_prog *tgt_prog,
> @@ -14281,8 +14286,9 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> /* fentry/fexit/fmod_ret progs can be sleepable only if they are
> * attached to ALLOW_ERROR_INJECTION and are not in denylist.
> */
> - if (!check_non_sleepable_error_inject(btf_id) &&
> - within_error_injection_list(addr))
> + if (within_error_injection_list(addr) &&
> + !check_non_sleepable_error_inject(btf_id) &&
> + !is_non_sleepable_error_inject(addr))
> ret = 0;
> break;
> case BPF_PROG_TYPE_LSM:
> diff --git a/lib/error-inject.c b/lib/error-inject.c
> index 2ff5ef689d72..560c3b18f439 100644
> --- a/lib/error-inject.c
> +++ b/lib/error-inject.c
> @@ -183,6 +183,8 @@ static const char *error_type_string(int etype)
> return "ERRNO_NULL";
> case EI_ETYPE_TRUE:
> return "TRUE";
> + case EI_ETYPE_NS_ERRNO:
> + return "NS_ERRNO";
> default:
> return "(unknown)";
> }
> --
> 2.35.1
>
On Tue, Apr 26, 2022 at 6:03 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Thu, Apr 21, 2022 at 04:07:33PM +0200, Benjamin Tissoires wrote:
> > Hi,
> >
> > so after the reviews from v3, and some discussion with Alexei, I am
> > back with a new version of HID-BPF.
> >
> > This version is not complete (thus the RFC), but I'd like to share
> > it now to get initial feedback, in case I am too far from the actual
> > goal.
> >
> > FTR, the goal is to provide some changes in the core verifier/btf so
> > that we can plug in HID-BPF independently from BPF core. This way we can
> > extend it without having to care about bpf-next.
>
> Overall looks great. imo much cleaner, simpler and more extensible
> than the earlier versions.
> The bpf core extensions are nicely contained and HID side can be
> worked on in parallel.
\o/
>
> > The things I am not entirely sure are:
> > - do we need only fentry/fexit/fmod_ret BPF program types or should
> > programs that modify the data stream use a different kind?
>
> Probably not. I'll reply in patch 2.
>
> > - patch 3/7 is probably not the correct approach (see comments in the
> > patch itself)
> >
> > We are missing quite a few bits here:
> > - selftests for patches 1 to 4
> > - add the ability to attach a program to a struct device, and run that
> > program only for that struct device
>
> yes. That is still to be figured out.
I spent some time on that, and I don't think it makes a lot of sense
to use the current trampoline approach if we want to keep on using
fentry/fexit...
- the trampoline is pretty nice, but it adds instructions before
calling the actual function, meaning that adding a check on struct
device will be quite hard to do (we have no idea where the struct
device is in the arguments) and will take more space on the trampoline
itself
- there is a limit on how many functions can be attached to a
trampoline (38 IIRC), and we probably will explode that number quickly
enough when we get more BPF programs to support HID devices.
So my chain of thought from yesterday was the following (completely
untested of course):
- instead of writing a new BPF API that might move in the future while
things are settling, I can actually simply load a tracer BPF program
from HID that monitors the BPF programs that are attached to a given
function
- I can also add a new API (a kfunc likely) that "registers" a given
BPF program (through its fd) to a given HID device
- when a device sends data, it hits hid_bpf_device_event() which will
have a default BPF program (loaded by the kernel) that dispatches the
registered BPF programs based on the HID device.
This would solve the 2 issues above IMO, except that the kfunc to
register a HID BPF program will suddenly be not standard.
>
> > - when running through bpf_prog_test_run_opts, how can we ensure we are
> > talking to the correct device? (I have a feeling this is linked to the
> > previous point)
> > - how can we reconnect the device when a report descriptor fixup BPF
> > program is loaded (would it make sense to allow some notifications on
> > when a BPF program is attached/detached to a device, and which
> > function have been traced?)
>
> Not sure I follow. What kind of notification do you have in mind?
> To user space?
>
No, this is in-kernel notifications.
What I want to do, is when I load a BPF program that changes the HID
report descriptor, hid-core detects that and reconnects the attached
device.
But after a couple of days of thinking, and with the above approach
where HID would preload a BPF program, I should be able to achieve
that with the "register BPF through a HID kfunc call". When I see that
we are attaching a HID report descriptor fixup to a given HID device,
I can then reconnect the matching device.
It would certainly be cleaner to have a general "notify me when a
tracer is attached to this particular function", but we can hide that
right now with a preloaded BPF program :)
Cheers,
Benjamin
On Tue, Apr 26, 2022 at 6:11 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Thu, Apr 21, 2022 at 04:07:36PM +0200, Benjamin Tissoires wrote:
> > When using error-injection function through bpf to change the return
> > code, we need to know if the function is sleepable or not.
> >
> > Currently the code assumes that all error-inject functions are sleepable,
> > except for a few selected of them, hardcoded in kernel/bpf/verifier.c
> >
> > Add a new flag to error-inject so we can code that information where the
> > function is declared.
> >
> > Signed-off-by: Benjamin Tissoires <[email protected]>
> >
> > ---
> >
> > new in v4:
> > - another approach would be to define a new kfunc_set, and register
> > it with btf. But in that case, what program type would we use?
> > BPF_PROG_TYPE_UNSPEC?
> > - also note that maybe we should consider all of the functions
> > non-sleepable and only mark some as sleepable. IMO it makes more
> > sense to be more restrictive by default.
>
> I think the approach in this patch is fine.
> We didn't have issues with check_non_sleepable_error_inject() so far,
> so I wouldn't start refactoring it.
OK... though I can't help thinking that adding a new
error-inject.h enum value is going to be bad, because it's an API
change, and users might not expect NS_ERRNO.
OTOH, if we had a new kfunc_set, we keep the existing error-inject API
in place with all the variants and we just teach the verifier that the
function is non sleepable.
>
> > ---
> > include/asm-generic/error-injection.h | 1 +
> > kernel/bpf/verifier.c | 10 ++++++++--
> > lib/error-inject.c | 2 ++
> > 3 files changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/asm-generic/error-injection.h b/include/asm-generic/error-injection.h
> > index fbca56bd9cbc..5974942353a6 100644
> > --- a/include/asm-generic/error-injection.h
> > +++ b/include/asm-generic/error-injection.h
> > @@ -9,6 +9,7 @@ enum {
> > EI_ETYPE_ERRNO, /* Return -ERRNO if failure */
> > EI_ETYPE_ERRNO_NULL, /* Return -ERRNO or NULL if failure */
> > EI_ETYPE_TRUE, /* Return true if failure */
> > + EI_ETYPE_NS_ERRNO, /* Return -ERRNO if failure and tag the function as non-sleepable */
>
> > };
> >
> > struct error_injection_entry {
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 0f339f9058f3..45c8feea6478 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -14085,6 +14085,11 @@ static int check_non_sleepable_error_inject(u32 btf_id)
> > return btf_id_set_contains(&btf_non_sleepable_error_inject, btf_id);
> > }
> >
> > +static int is_non_sleepable_error_inject(unsigned long addr)
> > +{
> > + return get_injectable_error_type(addr) == EI_ETYPE_NS_ERRNO;
>
> It's a linear search. Probably ok. But would be good to double check
> that we're not calling it a lot.
IIUC, the kfunc_set approach would solve that, no?
Cheers,
Benjamin
>
> > +}
> > +
> > int bpf_check_attach_target(struct bpf_verifier_log *log,
> > const struct bpf_prog *prog,
> > const struct bpf_prog *tgt_prog,
> > @@ -14281,8 +14286,9 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
> > /* fentry/fexit/fmod_ret progs can be sleepable only if they are
> > * attached to ALLOW_ERROR_INJECTION and are not in denylist.
> > */
> > - if (!check_non_sleepable_error_inject(btf_id) &&
> > - within_error_injection_list(addr))
> > + if (within_error_injection_list(addr) &&
> > + !check_non_sleepable_error_inject(btf_id) &&
> > + !is_non_sleepable_error_inject(addr))
> > ret = 0;
> > break;
> > case BPF_PROG_TYPE_LSM:
> > diff --git a/lib/error-inject.c b/lib/error-inject.c
> > index 2ff5ef689d72..560c3b18f439 100644
> > --- a/lib/error-inject.c
> > +++ b/lib/error-inject.c
> > @@ -183,6 +183,8 @@ static const char *error_type_string(int etype)
> > return "ERRNO_NULL";
> > case EI_ETYPE_TRUE:
> > return "TRUE";
> > + case EI_ETYPE_NS_ERRNO:
> > + return "NS_ERRNO";
> > default:
> > return "(unknown)";
> > }
> > --
> > 2.35.1
> >
>
On Tue, Apr 26, 2022 at 6:09 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Thu, Apr 21, 2022 at 04:07:35PM +0200, Benjamin Tissoires wrote:
> > When a kfunc is not returning a pointer to a struct but to a plain type,
> > check if one of the arguments is called __sz and is a const from the
> > caller, and use this as the size of the allocated memory.
> >
> > For tracing programs, we consider the provided memory to be read only
> > unless the program is BPF_MODIFY_RETURN.
> >
> > Signed-off-by: Benjamin Tissoires <[email protected]>
> >
> > ---
> >
> > new in v4
> > ---
> > include/linux/btf.h | 6 ++++
> > kernel/bpf/btf.c | 31 ++++++++++++++++----
> > kernel/bpf/verifier.c | 66 +++++++++++++++++++++++++++++++++----------
> > 3 files changed, 83 insertions(+), 20 deletions(-)
> >
> > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > index 36bc09b8e890..76a3ff48ae2a 100644
> > --- a/include/linux/btf.h
> > +++ b/include/linux/btf.h
> > @@ -332,6 +332,12 @@ static inline struct btf_param *btf_params(const struct btf_type *t)
> > return (struct btf_param *)(t + 1);
> > }
> >
> > +struct bpf_reg_state;
> > +
> > +bool btf_is_kfunc_arg_mem_size(const struct btf *btf,
> > + const struct btf_param *arg,
> > + const struct bpf_reg_state *reg);
> > +
> > #ifdef CONFIG_BPF_SYSCALL
> > struct bpf_prog;
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index 76318a4c2d0e..22e6e3cdc7ee 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -5851,9 +5851,9 @@ static bool __btf_type_is_scalar_struct(struct bpf_verifier_log *log,
> > return true;
> > }
> >
> > -static bool is_kfunc_arg_mem_size(const struct btf *btf,
> > - const struct btf_param *arg,
> > - const struct bpf_reg_state *reg)
> > +bool btf_is_kfunc_arg_mem_size(const struct btf *btf,
> > + const struct btf_param *arg,
> > + const struct bpf_reg_state *reg)
> > {
> > int len, sfx_len = sizeof("__sz") - 1;
> > const struct btf_type *t;
> > @@ -5976,7 +5976,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> > reg_btf = reg->btf;
> > reg_ref_id = reg->btf_id;
> > /* Ensure only one argument is referenced
> > - * PTR_TO_BTF_ID, check_func_arg_reg_off relies
> > + * PTR_TO_BTF_ID or PTR_TO_MEM, check_func_arg_reg_off relies
> > * on only one referenced register being allowed
> > * for kfuncs.
> > */
> > @@ -6012,7 +6012,10 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> > u32 type_size;
> >
> > if (is_kfunc) {
> > - bool arg_mem_size = i + 1 < nargs && is_kfunc_arg_mem_size(btf, &args[i + 1], &regs[regno + 1]);
> > + bool arg_mem_size = i + 1 < nargs &&
> > + btf_is_kfunc_arg_mem_size(btf,
> > + &args[i + 1],
> > + &regs[regno + 1]);
>
> bpf allows ~100 chars. No need to break the line so much.
>
> >
> > /* Permit pointer to mem, but only when argument
> > * type is pointer to scalar, or struct composed
> > @@ -6039,6 +6042,24 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> > i++;
> > continue;
> > }
> > +
> > + if (rel && reg->ref_obj_id) {
> > + /* Ensure only one argument is referenced
> > + * PTR_TO_BTF_ID or PTR_TO_MEM, check_func_arg_reg_off
> > + * relies on only one referenced register being allowed
> > + * for kfuncs.
> > + */
> > + if (ref_obj_id) {
> > + bpf_log(log,
> > + "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
> > + regno,
> > + reg->ref_obj_id,
> > + ref_obj_id);
> > + return -EFAULT;
> > + }
> > + ref_regno = regno;
> > + ref_obj_id = reg->ref_obj_id;
> > + }
> > }
> >
> > resolve_ret = btf_resolve_size(btf, ref_t, &type_size);
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 71827d14724a..0f339f9058f3 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -6974,7 +6974,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > int err, insn_idx = *insn_idx_p;
> > const struct btf_param *args;
> > struct btf *desc_btf;
> > + enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
> > bool acq;
> > + size_t reg_size = 0;
> >
> > /* skip for now, but return error when we find this in fixup_kfunc_call */
> > if (!insn->imm)
> > @@ -7015,8 +7017,8 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > }
> > }
> >
> > - for (i = 0; i < CALLER_SAVED_REGS; i++)
> > - mark_reg_not_init(env, regs, caller_saved[i]);
> > + /* reset REG_0 */
> > + mark_reg_not_init(env, regs, BPF_REG_0);
> >
> > /* Check return type */
> > t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
> > @@ -7026,6 +7028,9 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > return -EINVAL;
> > }
> >
> > + nargs = btf_type_vlen(func_proto);
> > + args = btf_params(func_proto);
> > +
> > if (btf_type_is_scalar(t)) {
> > mark_reg_unknown(env, regs, BPF_REG_0);
> > mark_btf_func_reg_size(env, BPF_REG_0, t->size);
> > @@ -7033,24 +7038,54 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> > ptr_type = btf_type_skip_modifiers(desc_btf, t->type,
> > &ptr_type_id);
> > if (!btf_type_is_struct(ptr_type)) {
> > - ptr_type_name = btf_name_by_offset(desc_btf,
> > - ptr_type->name_off);
> > - verbose(env, "kernel function %s returns pointer type %s %s is not supported\n",
> > - func_name, btf_type_str(ptr_type),
> > - ptr_type_name);
> > - return -EINVAL;
> > + /* if we have an array, we must have a const argument named "__sz" */
> > + for (i = 0; i < nargs; i++) {
> > + u32 regno = i + BPF_REG_1;
> > + struct bpf_reg_state *reg = &regs[regno];
> > +
> > + /* look for any const scalar parameter of name "__sz" */
> > + if (!check_reg_arg(env, regno, SRC_OP) &&
> > + tnum_is_const(regs[regno].var_off) &&
> > + btf_is_kfunc_arg_mem_size(desc_btf, &args[i], reg))
> > + reg_size = regs[regno].var_off.value;
> > + }
> > +
> > + if (!reg_size) {
> > + ptr_type_name = btf_name_by_offset(desc_btf,
> > + ptr_type->name_off);
> > + verbose(env,
> > + "kernel function %s returns pointer type %s %s is not supported\n",
> > + func_name,
> > + btf_type_str(ptr_type),
> > + ptr_type_name);
> > + return -EINVAL;
> > + }
> > +
> > + mark_reg_known_zero(env, regs, BPF_REG_0);
> > + regs[BPF_REG_0].type = PTR_TO_MEM;
> > + regs[BPF_REG_0].mem_size = reg_size;
> > +
> > + /* in case of tracing, only allow write access to
> > + * BPF_MODIFY_RETURN programs
> > + */
> > + if (prog_type == BPF_PROG_TYPE_TRACING &&
> > + env->prog->expected_attach_type != BPF_MODIFY_RETURN)
> > + regs[BPF_REG_0].type |= MEM_RDONLY;
>
> MOD_RET restriction looks artificial.
> We can distinguish readonly vs writeable PTR_TO_MEM based on
> another naming convention.
> Currently arg_name__sz applies to the previous argument.
> Matching suffix made sense there.
Oh, I missed the point of the "__sz". I did not realize it was
supposed to be a suffix.
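For reference, the suffix convention boils down to a plain string check on the
argument *name*: a size argument named e.g. "buf__sz" describes the length of
the pointer argument before it. Here is a standalone user-space sketch of that
check (it mirrors what is_kfunc_arg_mem_size() does, but is not the kernel
code itself):

```c
#include <string.h>
#include <stdbool.h>

/* Standalone sketch of the "__sz" suffix convention: the size
 * argument's name must end in "__sz" (e.g. "buf__sz") and have at
 * least one character before the suffix. This mirrors the string
 * check in is_kfunc_arg_mem_size(), but is only an illustration.
 */
static bool arg_name_is_mem_size(const char *param_name)
{
	const size_t sfx_len = sizeof("__sz") - 1; /* 4 */
	size_t len;

	if (!param_name)
		return false;
	len = strlen(param_name);
	if (len < sfx_len + 1) /* need a base name before the suffix */
		return false;
	return strcmp(param_name + len - sfx_len, "__sz") == 0;
}
```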
> Reusing the same suffix matching for a different purpose could be confusing.
> For this use case we may reserve a full argument name.
> Like "rdonly_buf_size" and "rdwr_buf_size" ?
>
I like the idea but I have 2 problems here:
1. I do not really want to have 2 separate kfuncs for read only and
write operations
2. How can I restrict the write operation to fmod_ret?
For 1, my guess is that the read-only operation will not be used
unless we solve 2.
For 2, the rationale is that I think tracing functions are not
supposed to change the behavior. This was said on the thread about
priorities for BPF programs. And it somehow makes sense that fentry
should be used for tracing only. OTOH, fmod_ret is clearly affecting
the behavior of the program, so I see it more "natural" that it can
change the context too.
Cheers,
Benjamin
On Tue, Apr 26, 2022 at 12:20 AM Benjamin Tissoires
<[email protected]> wrote:
>
> On Tue, Apr 26, 2022 at 6:03 AM Alexei Starovoitov
> <[email protected]> wrote:
> >
> > On Thu, Apr 21, 2022 at 04:07:33PM +0200, Benjamin Tissoires wrote:
> > > Hi,
> > >
> > > so after the reviews from v3, and some discussion with Alexei, I am
> > > back with a new version of HID-BPF.
> > >
> > > This version is not complete (thus the RFC), but I'd like to share
> > > it now to get initial feedback, in case I am too far from the actual
> > > goal.
> > >
> > > FTR, the goal is to provide some changes in the core verifier/btf so
> > > that we can plug in HID-BPF independently from BPF core. This way we can
> > > extend it without having to care about bpf-next.
> >
> > Overall looks great. imo much cleaner, simpler and more extensible
> > than the earlier versions.
> > The bpf core extensions are nicely contained and HID side can be
> > worked on in parallel.
>
> \o/
>
> >
> > > The things I am not entirely sure are:
> > > - do we need only fentry/fexit/fmod_ret BPF program types or should
> > > programs that modify the data stream use a different kind?
> >
> > Probably not. I'll reply in patch 2.
> >
> > > - patch 3/7 is probably not the correct approach (see comments in the
> > > patch itself)
> > >
> > > We are missing quite a few bits here:
> > > - selftests for patches 1 to 4
> > > - add the ability to attach a program to a struct device, and run that
> > > program only for that struct device
> >
> > yes. That is still to be figured out.
>
> I spent some time on that, and I don't think it makes a lot of sense
> to use the current trampoline approach if we want to keep on using
> fentry/fexit...
> - the trampoline is pretty nice, but it adds instructions before
> calling the actual function, meaning that adding a check on struct
> device will be quite hard to do (we have no idea where the struct
> device is in the arguments) and will take more space on the trampoline
> itself
> - there is a limit on how many functions can be attached to a
> trampoline (38 IIRC), and we probably will explode that number quickly
> enough when we get more BPF programs to support HID devices.
Ohh. This is an obsolete limitation.
38 was the limit back when we used the half-page optimization
for the bpf trampoline.
That optimization is gone now, so we can easily lift this max.
> So my chain of thoughts from yesterday was the following (completely
> untested of course):
> - instead of writing a new BPF API that might move in the future while
> things are settling, I can actually simply load a tracer BPF program
> from HID that monitors the BPF programs that are attached to a given
> function
> - I can also add a new API (a kfunc likely) that "registers" a given
> BPF program (through its fd) to a given HID device
> - when a device sends data, it hits hid_bpf_device_event() which will
> have a default BPF program (loaded by the kernel) that dispatches the
> registered BPF programs based on the HID device.
>
> This would solve the 2 issues above IMO, except that the kfunc to
> register a HID BPF program will suddenly be not standard.
Could you add more details to these ideas?
I thought you wanted bpf prog writers to be independent of each other.
They would tell some framework which HID device id/PCIe id they need
and the rest would be automatic.
Maybe we can achieve that by adding another layer before libbpf
that would accept (bpf_prog, hid_id) tuple and insert
if (hid->id != hid_id) return -E..;
as the first insn into bpf_prog before loading into the kernel.
All such progs will be kfunc-s attached to the same hook.
The kernel will execute them sequentially.
The framework will provide demux by auto-inserting this 'if'.
This 'if (hid)' could be a part of sample code too.
We can simply ask prog writers to follow this style.
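Something like this (a completely untested user-space sketch of the guard
style; the struct, field, and function names are made up for illustration):

```c
#include <stdbool.h>

/* Stand-in for the HID context the hook would pass; the real hook
 * would also carry the report data. */
struct hid_ctx {
	int hid_id;
};

static bool my_prog_ran;

/* Sketch of the demux prologue the framework (or the prog writer)
 * would put at the top of every per-device program: bail out early
 * when the event is not for our device, then run the real logic. */
static int my_device_prog(struct hid_ctx *hid)
{
	const int my_hid_id = 42; /* the device this prog was built for */

	if (hid->hid_id != my_hid_id)
		return -1; /* not our device, skip (errno elided above) */
	my_prog_ran = true;
	return 0;
}
```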
Another idea would be to do something like libxdp.
Attach a "dispatcher" bpf prog to the kfunc hook and use
some library to attach hid-specific progs as "freplace" kind of
programs. It's a more involved solution.
Another option is to use tail_calls, if hid_id is a relatively
small number. The "dispatcher" bpf prog
can do bpf_tail_call(prog_array, hid_id)
while hid-specific progs insert themselves into prog_array
instead of attaching to the kfunc.
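The tail_call-style dispatch can be sketched in user space with a plain
function-pointer array standing in for prog_array (all names here are
illustrative, none of this is a real BPF API):

```c
#define MAX_HID_ID 64

/* Stand-in for a HID event context. */
struct hid_event {
	int hid_id;
	int handled;
};

typedef void (*hid_prog_t)(struct hid_event *ev);

/* Stand-in for the BPF prog_array map, indexed by hid_id. */
static hid_prog_t prog_array[MAX_HID_ID];

/* What hid-specific progs would do instead of attaching to the
 * kfunc: insert themselves into prog_array under their device id. */
static int register_prog(int hid_id, hid_prog_t prog)
{
	if (hid_id < 0 || hid_id >= MAX_HID_ID)
		return -1;
	prog_array[hid_id] = prog;
	return 0;
}

/* The "dispatcher": morally bpf_tail_call(prog_array, hid_id). */
static void dispatch_event(struct hid_event *ev)
{
	hid_prog_t prog;

	if (ev->hid_id < 0 || ev->hid_id >= MAX_HID_ID)
		return;
	prog = prog_array[ev->hid_id];
	if (prog)
		prog(ev);
}

static void my_mouse_prog(struct hid_event *ev)
{
	ev->handled = 1;
}
```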
> >
> > > - when running through bpf_prog_test_run_opts, how can we ensure we are
> > > talking to the correct device? (I have a feeling this is linked to the
> > > previous point)
> > > - how can we reconnect the device when a report descriptor fixup BPF
> > > program is loaded (would it make sense to allow some notifications on
> > > when a BPF program is attached/detached to a device, and which
> > > function have been traced?)
> >
> > Not sure I follow. What kind of notification do you have in mind?
> > To user space?
> >
>
> No, this is in-kernel notifications.
> What I want to do, is when I load a BPF program that changes the HID
> report descriptor, hid-core detects that and reconnects the attached
> device.
>
> But after a couple of days of thinking, and with the above approach
> where HID would preload a BPF program, I should be able to achieve
> that with the "register BPF through a HID kfunc call". When I see that
> we are attaching a HID report descriptor fixup to a given HID device,
> I can then reconnect the matching device.
>
> It would certainly be cleaner to have a general "notify me when a
> tracer is attached to this particular function", but we can hide that
> right now with a preloaded BPF program :)
There are a few lsm hooks in bpf core. It would probably be weird
for hid core to hook into them. We can add a few tracepoints
at the attach functions if that helps.
On Sat, Apr 30, 2022 at 5:26 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Tue, Apr 26, 2022 at 12:30 AM Benjamin Tissoires
> <[email protected]> wrote:
> >
> > On Tue, Apr 26, 2022 at 6:09 AM Alexei Starovoitov
> > <[email protected]> wrote:
> > >
> > > On Thu, Apr 21, 2022 at 04:07:35PM +0200, Benjamin Tissoires wrote:
> > > > When a kfunc is not returning a pointer to a struct but to a plain type,
> > > > check if one of the arguments is called __sz and is a const from the
> > > > caller, and use this as the size of the allocated memory.
> > > >
> > > > For tracing programs, we consider the provided memory to be read only
> > > > unless the program is BPF_MODIFY_RETURN.
> > > >
> > > > Signed-off-by: Benjamin Tissoires <[email protected]>
> > > >
> > > > [... patch body snipped; quoted in full earlier in the thread ...]
> > > > + /* in case of tracing, only allow write access to
> > > > + * BPF_MODIFY_RETURN programs
> > > > + */
> > > > + if (prog_type == BPF_PROG_TYPE_TRACING &&
> > > > + env->prog->expected_attach_type != BPF_MODIFY_RETURN)
> > > > + regs[BPF_REG_0].type |= MEM_RDONLY;
> > >
> > > MOD_RET restriction looks artificial.
> > > We can distinguish readonly vs writeable PTR_TO_MEM based on
> > > another naming convention.
> > > Currently arg_name__sz applies to the previous argument.
> > > Matching suffix made sense there.
> >
> > Oh, I missed the point of the "__sz". I did not realize it was
> > supposed to be a suffix.
> >
> > > Reusing the same suffix matching for a different purpose could be confusing.
> > > For this use case we may reserve a full argument name.
> > > Like "rdonly_buf_size" and "rdwr_buf_size" ?
> > >
> >
> > I like the idea but I have 2 problems here:
> > 1. I do not really want to have 2 separate kfuncs for read only and
> > write operations
> > 2. How can I restrict the write operation to fmod_ret?
> >
> > For 1, my guess is that the read-only operation will not be used
> > unless we solve 2.
> > For 2, the rationale is that I think tracing functions are not
> > supposed to change the behavior. This was said on the thread about
> > priorities for BPF programs. And it somehow makes sense that fentry
> > should be used for tracing only. OTOH, fmod_ret is clearly affecting
> > the behavior of the program, so I see it more "natural" that it can
> > change the context too.
>
> Well, if we say that fentry is rdonly and fmod_ret is rdwr
> then we probably shouldn't stop at return value.
Yeah, it makes sense, but it will be a slightly bigger effort.
> If bpf prog can access the argument and this argument is an array
> it should be writable.
> We can allow different kfuncs for fentry and fmod_ret too.
Good idea. No idea if this will be easy to implement though :)
But after the prototype I describe in 0/7, I don't really need to
enforce whether access is read or write based on fentry/fmod_ret.
So I think let's not over engineer this, and stick to your initial
suggestion of the naming system which will be largely enough for my
use case.
> They can be two tiny wrappers with different arg names (to distinguish
> rdonly vs rdwr) on top of the single always_inline function
> that returns a buffer.
>
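The two-wrapper shape suggested above could look roughly like this (a
user-space sketch; the function names and the reserved argument names follow
the suggested convention, but none of this is an existing kernel API):

```c
#include <stddef.h>

static unsigned char report_buf[64];

/* Single always_inline-style worker that returns the buffer. */
static inline unsigned char *__hid_get_data(size_t buf_size)
{
	if (buf_size > sizeof(report_buf))
		return NULL;
	return report_buf;
}

/* Read-only flavor: the verifier would key on the reserved argument
 * name "rdonly_buf_size" and mark the returned PTR_TO_MEM as
 * MEM_RDONLY. */
static const unsigned char *hid_get_data_rdonly(size_t rdonly_buf_size)
{
	return __hid_get_data(rdonly_buf_size);
}

/* Read-write flavor: "rdwr_buf_size" would yield a writable
 * PTR_TO_MEM instead. */
static unsigned char *hid_get_data_rdwr(size_t rdwr_buf_size)
{
	return __hid_get_data(rdwr_buf_size);
}
```

Both wrappers share the one worker, so only the argument name (and hence the
verifier-visible access mode) differs between them.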
Cheers,
Benjamin
On Tue, Apr 26, 2022 at 12:52 AM Benjamin Tissoires
<[email protected]> wrote:
>
> On Tue, Apr 26, 2022 at 6:11 AM Alexei Starovoitov
> <[email protected]> wrote:
> >
> > On Thu, Apr 21, 2022 at 04:07:36PM +0200, Benjamin Tissoires wrote:
> > > When using error-injection function through bpf to change the return
> > > code, we need to know if the function is sleepable or not.
> > >
> > > Currently the code assumes that all error-inject functions are sleepable,
> > > except for a few selected of them, hardcoded in kernel/bpf/verifier.c
> > >
> > > Add a new flag to error-inject so we can code that information where the
> > > function is declared.
> > >
> > > Signed-off-by: Benjamin Tissoires <[email protected]>
> > >
> > > ---
> > >
> > > new in v4:
> > > - another approach would be to define a new kfunc_set, and register
> > > it with btf. But in that case, what program type would we use?
> > > BPF_PROG_TYPE_UNSPEC?
> > > - also note that maybe we should consider all of the functions
> > > non-sleepable and only mark some as sleepable. IMO it makes more
> > > sense to be more restrictive by default.
> >
> > I think the approach in this patch is fine.
> > We didn't have issues with check_non_sleepable_error_inject() so far,
> > so I wouldn't start refactoring it.
>
> OK... though I can't help but thinking that adding a new
> error-inject.h enum value is going to be bad, because it's an API
> change, and users might not expect NS_ERRNO.
Not sure about the api concern. These are kernel-internal tags;
bpf progs are not aware of them. The functions can change
from sleepable to non-sleepable too,
allow_error_inject can be removed, and so on.
> OTOH, if we had a new kfunc_set, we keep the existing error-inject API
> in place with all the variants and we just teach the verifier that the
> function is non sleepable.
...
> IIUC, the kfunc_set approach would solve that, no?
Makes sense. Let's figure out an extensible kfunc_set approach
that is not centralized in verifier.c
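Part of the appeal of a kfunc_set here is that BTF id sets are kept sorted, so
membership is a binary search instead of the linear walk through the
error-injection list. A standalone sketch of that lookup (it mirrors the shape
of btf_id_set_contains(), but is not the kernel code itself):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Sketch of a sorted id set, like struct btf_id_set. */
struct id_set {
	uint32_t cnt;
	const uint32_t *ids; /* must be sorted ascending */
};

static int cmp_id(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* O(log n) membership test, mirroring btf_id_set_contains();
 * compare with the O(n) get_injectable_error_type() walk that
 * is_non_sleepable_error_inject() would otherwise do per check. */
static bool id_set_contains(const struct id_set *set, uint32_t id)
{
	return bsearch(&id, set->ids, set->cnt, sizeof(id), cmp_id) != NULL;
}
```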
On Tue, Apr 26, 2022 at 12:30 AM Benjamin Tissoires
<[email protected]> wrote:
>
> On Tue, Apr 26, 2022 at 6:09 AM Alexei Starovoitov
> <[email protected]> wrote:
> >
> > On Thu, Apr 21, 2022 at 04:07:35PM +0200, Benjamin Tissoires wrote:
> > > When a kfunc is not returning a pointer to a struct but to a plain type,
> > > check if one of the arguments is called __sz and is a const from the
> > > caller, and use this as the size of the allocated memory.
> > >
> > > For tracing programs, we consider the provided memory to be read only
> > > unless the program is BPF_MODIFY_RETURN.
> > >
> > > Signed-off-by: Benjamin Tissoires <[email protected]>
> > >
> > > [... patch body snipped; quoted in full earlier in the thread ...]
> > > + /* in case of tracing, only allow write access to
> > > + * BPF_MODIFY_RETURN programs
> > > + */
> > > + if (prog_type == BPF_PROG_TYPE_TRACING &&
> > > + env->prog->expected_attach_type != BPF_MODIFY_RETURN)
> > > + regs[BPF_REG_0].type |= MEM_RDONLY;
> >
> > MOD_RET restriction looks artificial.
> > We can distinguish readonly vs writeable PTR_TO_MEM based on
> > another naming convention.
> > Currently arg_name__sz applies to the previous argument.
> > Matching suffix made sense there.
>
> Oh, I missed the point of the "__sz". I did not realize it was
> supposed to be a suffix.
>
> > Reusing the same suffix matching for a different purpose could be confusing.
> > For this use case we may reserve a full argument name.
> > Like "rdonly_buf_size" and "rdwr_buf_size" ?
> >
>
> I like the idea but I have 2 problems here:
> 1. I do not really want to have 2 separate kfuncs for read only and
> write operations
> 2. How can I restrict the write operation to fmod_ret?
>
> For 1, my guess is that the read-only operation will not be used
> unless we solve 2.
> For 2, the rationale is that I think tracing functions are not
> supposed to change the behavior. This was said on the thread about
> priorities for BPF programs. And it somehow makes sense that fentry
> should be used for tracing only. OTOH, fmod_ret is clearly affecting
> the behavior of the program, so it seems more "natural" to me that it
> can change the context too.
Well, if we say that fentry is rdonly and fmod_ret is rdwr
then we probably shouldn't stop at return value.
If bpf prog can access the argument and this argument is an array
it should be writable.
We can allow different kfuncs for fentry and fmod_ret too.
They can be two tiny wrappers with different arg names (to distinguish
rdonly vs rdwr) on top of the single always_inline function
that returns a buffer.
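A minimal sketch of that wrapper idea, modeled here in plain C (all
identifiers are hypothetical; in the kernel these would be registered
kfuncs, and the verifier would key off the "rdonly_buf_size" /
"rdwr_buf_size" argument names suggested above):

```c
#include <stddef.h>

/* toy stand-in for the HID context backing the report data */
struct hid_ctx {
	unsigned char buf[64];
	size_t allocated;
};

/* the single always_inline implementation */
static inline __attribute__((always_inline))
void *__hid_get_data(struct hid_ctx *ctx, size_t offset, size_t sz)
{
	if (offset + sz > ctx->allocated)
		return NULL;	/* out of bounds: reject */
	return ctx->buf + offset;
}

/* fentry/fexit flavor: the verifier would mark the returned pointer
 * PTR_TO_MEM | MEM_RDONLY based on the "rdonly_buf_size" arg name */
const void *hid_bpf_get_data_rdonly(struct hid_ctx *ctx, size_t offset,
				    size_t rdonly_buf_size)
{
	return __hid_get_data(ctx, offset, rdonly_buf_size);
}

/* fmod_ret flavor: the verifier would mark the returned pointer as a
 * writable PTR_TO_MEM based on the "rdwr_buf_size" arg name */
void *hid_bpf_get_data(struct hid_ctx *ctx, size_t offset,
		       size_t rdwr_buf_size)
{
	return __hid_get_data(ctx, offset, rdwr_buf_size);
}
```

Both wrappers collapse to the same code; only the argument name tells
the verifier which access rights to grant on the returned buffer.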
On Sat, Apr 30, 2022 at 9:12 AM Benjamin Tissoires
<[email protected]> wrote:
>
[...]
> This is roughly what I have now:
>
> - hid-core is not aware of BPF except for a few __weak
> ALLOW_ERROR_INJECTION hooks (dispatch_hid_bpf_device_event for
> example)
> - I have a separate hid-bpf module that attaches BPF traces to these
> hooks and calls a "dispatch" kfunc within hid-bpf
> - the dispatch function then do a succession of BPF calls to programs
> attached to it by using bpf_tail_call(prog_array, hid_id)
>
> - for the clients, they define one or more
> SEC("fmod_ret/hid_bpf_device_event"). That __weak hook is declared in
> the kernel by hid-bpf but is never called, it's just an API
> declaration
> - then clients call in a SEC("syscall")
> hid_bpf_attach_prog(ctx->prog_fd, ctx->hid_id, ctx->flags);
> - hid_bpf_attach_prog is a kfunc that takes a ref on the struct
> bpf_prog*, and stores that program in the correct struct bpf_map *for
> the given attached_btf_id (hid_bpf_device_event in our case)
>
> And that's about it.
> I still need to handle automatic release of the bpf prog when there is
> no userspace open fd on it unless it's pinned but I think this should
> be working fine.
>
> I also probably need to pin some SEC("syscall") (hid_bpf_attach_prog
> and hid_bpf_dettach_prog) so users don't have to write them down and
> can just use the ones provided by the kernel.
>
> The nice thing is that I can define my own API for the attach call
> without dealing with bpf core. I can thus add a priority flag that is
> relevant here because the data coming through the bpf program can be
> modified.
>
> The other thing is that now, I don't care which function we are in to
> decide if a RET_PTR_MEM is read only or not. I can deal with that by
> either playing with the flags or even replacing entirely the dispatch
> trace prog from userspace if I want to access the raw events.
>
> However, the downsides are:
> - I need to also define kfuncs for BPF_PROG_TYPE_SYSCALL (I don't
> think It'll be a big issue)
> - The only way I could store the bpf_prog into the map was to hack
> around the map ops, because the fd of the map in the skel is not
> available while doing a SEC("syscall") from a different process.
Update on this side: I realized that I could use the syscall
BPF_MAP_GET_FD_BY_ID instead to get an fd for the current task.
However, I've been bitten quite hard today because I was using
bpf_map_get() instead of bpf_map_get_with_uref(), and so every time I
closed the fd in the syscall the map was cleared...
But now I would like to have more than one program of a type per hid
device, meaning that I can not have only one bpf_map of type
BPF_MAP_TYPE_PROG_ARRAY.
I have explored BPF_MAP_TYPE_HASH_OF_MAPS, but we can not have
BPF_MAP_TYPE_PROG_ARRAY as inner maps with the current code. And I'd
need 2 levels of nesting (which is not authorized today):
- hid_jmp_table (key: HID id)
- array of different program type per HID device (key: HID_BPF_PROG_TYPE)
- BPF_MAP_TYPE_PROG_ARRAY with the actual progs (key: int)
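In plain C terms (ordinary arrays standing in for the nested BPF maps,
which is exactly the nesting that is not allowed today), the dispatch
I'd need looks roughly like this; all sizes and names are made up:

```c
#include <stddef.h>

#define MAX_HID_DEVICES		64
#define HID_BPF_PROG_TYPE_MAX	3	/* hypothetical number of prog types */
#define MAX_PROGS_PER_TYPE	8

typedef int (*hid_prog_t)(unsigned char *data, size_t size);

/* hid_jmp_table[hid_id][prog_type] models the innermost PROG_ARRAY */
static hid_prog_t hid_jmp_table[MAX_HID_DEVICES][HID_BPF_PROG_TYPE_MAX]
			       [MAX_PROGS_PER_TYPE];

/* example prog: invert the first byte of the event */
static int invert_first_byte(unsigned char *data, size_t size)
{
	if (size)
		data[0] ^= 0xff;
	return 1;
}

/* the real dispatcher would bpf_tail_call() into the inner map */
static int hid_bpf_dispatch(int hid_id, int prog_type,
			    unsigned char *data, size_t size)
{
	int ret = 0;

	for (int i = 0; i < MAX_PROGS_PER_TYPE; i++) {
		hid_prog_t prog = hid_jmp_table[hid_id][prog_type][i];

		if (!prog)
			continue;
		ret = prog(data, size);
	}
	return ret;
}
```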
The other solution would be to be able to create a map when needed,
store it in struct hid_device, and then call bpf_tail_call on this
map. The problem is that I need a way to teach the verifier that the
struct bpf_map pointer I have in the context is a true bpf_map...
Any input would be appreciated :)
Cheers,
Benjamin
>
> Also, I wonder if we should not have some way to namespace kfuncs.
> Ideally, I would like to prevent the usage of those kfuncs outside of
> some helpers that I define in HID so I don't have to worry too much
> about other trace programs fuzzing and segfaulting the kernel.
>
[...]
On Sat, Apr 30, 2022 at 5:30 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Tue, Apr 26, 2022 at 12:52 AM Benjamin Tissoires
> <[email protected]> wrote:
> >
> > On Tue, Apr 26, 2022 at 6:11 AM Alexei Starovoitov
> > <[email protected]> wrote:
> > >
> > > On Thu, Apr 21, 2022 at 04:07:36PM +0200, Benjamin Tissoires wrote:
> > > > When using error-injection function through bpf to change the return
> > > > code, we need to know if the function is sleepable or not.
> > > >
> > > > Currently the code assumes that all error-inject functions are sleepable,
> > > > except for a few selected of them, hardcoded in kernel/bpf/verifier.c
> > > >
> > > > Add a new flag to error-inject so we can code that information where the
> > > > function is declared.
> > > >
> > > > Signed-off-by: Benjamin Tissoires <[email protected]>
> > > >
> > > > ---
> > > >
> > > > new in v4:
> > > > - another approach would be to define a new kfunc_set, and register
> > > > it with btf. But in that case, what program type would we use?
> > > > BPF_PROG_TYPE_UNSPEC?
> > > > - also note that maybe we should consider all of the functions
> > > > non-sleepable and only mark some as sleepable. IMO it makes more
> > > > sense to be more restrictive by default.
> > >
> > > I think the approach in this patch is fine.
> > > We didn't have issues with check_non_sleepable_error_inject() so far,
> > > so I wouldn't start refactoring it.
> >
> > OK... though I can't help but thinking that adding a new
> > error-inject.h enum value is going to be bad, because it's an API
> > change, and users might not expect NS_ERRNO.
>
> Not sure about api concern. This is the kernel internal tag.
> bpf progs are not aware of them. The functions can change
> from sleepable to non-sleepable too.
> allow_error_inject can be removed. And so on.
>
> > OTOH, if we had a new kfunc_set, we keep the existing error-inject API
> > in place with all the variants and we just teach the verifier that the
> > function is non sleepable.
> ...
> > IIUC, the kfunc_set approach would solve that, no?
>
> Makes sense. Let's figure out an extensible kfunc_set approach
> that is not centralized in verifier.c
>
OK, I'll work on this in v5.
But I need to rethink the whole sleepable/non-sleepable definitions
for my use case, because I have now a clear separation between not
sleepable context (in fentry/fexit/fmod_ret) and sleepable context (in
SEC("syscall")), so maybe the whole thing is not really required.
Cheers,
Benjamin
On Sat, Apr 30, 2022 at 5:01 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Tue, Apr 26, 2022 at 12:20 AM Benjamin Tissoires
> <[email protected]> wrote:
> >
> > On Tue, Apr 26, 2022 at 6:03 AM Alexei Starovoitov
> > <[email protected]> wrote:
> > >
> > > On Thu, Apr 21, 2022 at 04:07:33PM +0200, Benjamin Tissoires wrote:
> > > > Hi,
> > > >
> > > > so after the reviews from v3, and some discussion with Alexei, I am
> > > > back with a new version of HID-BPF.
> > > >
> > > > This version is not complete (thus the RFC), but I'd like to share
> > > > it now to get initial feedback, in case I am too far from the actual
> > > > goal.
> > > >
> > > > FTR, the goal is to provide some changes in the core verifier/btf so
> > > > that we can plug in HID-BPF independently from BPF core. This way we can
> > > > extend it without having to care about bpf-next.
> > >
> > > Overall looks great. imo much cleaner, simpler and more extensible
> > > than the earlier versions.
> > > The bpf core extensions are nicely contained and HID side can be
> > > worked on in parallel.
> >
> > \o/
> >
> > >
> > > > The things I am not entirely sure are:
> > > > - do we need only fentry/fexit/fmod_ret BPF program types or should
> > > > programs that modify the data stream use a different kind?
> > >
> > > Probably not. I'll reply in patch 2.
> > >
> > > > - patch 3/7 is probably not the correct approach (see comments in the
> > > > patch itself)
> > > >
> > > > We are missing quite a few bits here:
> > > > - selftests for patches 1 to 4
> > > > - add the ability to attach a program to a struct device, and run that
> > > > program only for that struct device
> > >
> > > yes. That is still to be figured out.
> >
> > I spent some time on that, and I don't think it makes a lot of sense
> > to use the current trampoline approach if we want to keep on using
> > fentry/fexit...
> > - the trampoline is pretty nice, but it adds instructions before
> > calling the actual function, meaning that adding a check on struct
> > device will be quite hard to do (we have no idea where the struct
> > device is in the arguments) and will take more space on the trampoline
> > itself
> > - there is a limit on how many functions can be attached to a
> > trampoline (38 IIRC), and we probably will explode that number quickly
> > enough when we get more BPF programs to support HID devices.
>
> Ohh. This is an obsolete limitation.
> 38 was the number since we used half page optimization
> for bpf trampoline.
> It's gone now. We can easily lift this max.
Oh, interesting.
I have been working this week with that limitation in place, and I
just managed to get something working yesterday night. See below.
>
> > So my chain of thoughts from yesterday was the following (completely
> > untested of course):
> > - instead of writing a new BPF API that might move in the future while
> > things are settling, I can actually simply load a tracer BPF program
> > from HID that monitors the BPF programs that are attached to a given
> > function
> > - I can also add a new API (a kfunc likely) that "registers" a given
> > BPF program (through its fd) to a given HID device
> > - when a device sends data, it hits hid_bpf_device_event() which will
> > have a default BPF program (loaded by the kernel) that dispatches the
> > registered BPF programs based on the HID device.
> >
> > This would solve the 2 issues above IMO, except that the kfunc to
> > register a HID BPF program will suddenly be not standard.
>
> Could you add more details to these ideas?
That's basically your last suggestion below.
> I thought you wanted bpf prog writers to be independent of each other.
> They would tell some framework HID device id/pcie id that they need
> and the rest would be automatic.
> Maybe we can achieve that by adding another layer before libbpf
> that would accept (bpf_prog, hid_id) tuple and insert
> if (hid->id != hid_id) return -E..;
> as the first insn into bpf_prog before loading into the kernel.
> All such progs will be kfunc-s attached to the same hook.
> The kernel will execute them sequentially.
> The framework will provide demux by auto-inserting this 'if'.
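Modeled in plain C (the real check would be BPF instructions prepended
by the framework; MY_HID_ID is a placeholder the framework would fill
in per (bpf_prog, hid_id) tuple, and the fixup body is invented):

```c
#include <stddef.h>

struct hid_device {
	int id;
};

#define MY_HID_ID 42	/* filled in by the framework for this prog */

/* the auto-inserted demux check comes first: progs registered for
 * other devices bail out before touching the event */
static int hid_bpf_device_event(struct hid_device *hid,
				unsigned char *data, size_t size)
{
	if (hid->id != MY_HID_ID)
		return 0;	/* not our device: leave the event alone */

	/* device-specific fixup, e.g. flip a misreported bit in byte 1 */
	if (size > 1)
		data[1] ^= 0x01;
	return 0;
}
```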
That's an interesting approach. Much less convoluted than the one I
have right now. The only downside I see is that every program will be
called for every event, which might not be ideal (though the event
stream from HID is way smaller than network).
> This 'if (hid)' could be a part of sample code too.
> We can simply ask prog writers to follow this style.
Not a big fan of that...
>
> Another idea would be to do something like libxdp.
> Attach a "dispatcher" bpf prog to the kfunc hook and use
> some library to attach hid-specific progs as "freplace" kind of
> programs. It's a more involved solution.
freplace might not work for me because I want to attach more than one
program to a given HID device. Not so sure BPF_PROG_TYPE_REPLACE will
be enough there.
>
> Another option is to use tail_calls.
> If hid_id is a relatively small number. The "dispatcher" bpf prog
> can do bpf_tail_call(prog_array, hid_id)
> while hid specific progs insert itself into prog_array
> instead of attaching to kfunc.
This is roughly what I have now:
- hid-core is not aware of BPF except for a few __weak
ALLOW_ERROR_INJECTION hooks (dispatch_hid_bpf_device_event for
example)
- I have a separate hid-bpf module that attaches BPF traces to these
hooks and calls a "dispatch" kfunc within hid-bpf
- the dispatch function then does a succession of BPF calls to programs
attached to it by using bpf_tail_call(prog_array, hid_id)
- for the clients, they define one or more
SEC("fmod_ret/hid_bpf_device_event"). That __weak hook is declared in
the kernel by hid-bpf but is never called, it's just an API
declaration
- then clients call in a SEC("syscall")
hid_bpf_attach_prog(ctx->prog_fd, ctx->hid_id, ctx->flags);
- hid_bpf_attach_prog is a kfunc that takes a ref on the struct
bpf_prog*, and stores that program in the correct struct bpf_map *for
the given attached_btf_id (hid_bpf_device_event in our case)
And that's about it.
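On the client side, the flow above would look roughly like this
pseudo-BPF sketch (not compilable as-is; the hook name is as described
above, everything else, including the ctx layouts, is hypothetical):

```c
SEC("fmod_ret/hid_bpf_device_event")
int BPF_PROG(my_event_fixup, struct hid_bpf_ctx *ctx)
{
	/* modify the incoming event for our device here */
	return 0;
}

SEC("syscall")
int attach_prog(struct attach_prog_args *ctx)
{
	/* kfunc provided by hid-bpf: takes a ref on the prog and
	 * stores it in the map for the given attached_btf_id */
	return hid_bpf_attach_prog(ctx->prog_fd, ctx->hid_id, ctx->flags);
}
```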
I still need to handle automatic release of the bpf prog when there is
no userspace open fd on it unless it's pinned, but I think this should
be working fine.
I also probably need to pin some SEC("syscall") (hid_bpf_attach_prog
and hid_bpf_dettach_prog) so users don't have to write them down and
can just use the ones provided by the kernel.
The nice thing is that I can define my own API for the attach call
without dealing with bpf core. I can thus add a priority flag that is
relevant here because the data coming through the bpf program can be
modified.
The other thing is that now, I don't care which function we are in to
decide if a RET_PTR_MEM is read only or not. I can deal with that by
either playing with the flags or even replacing entirely the dispatch
trace prog from userspace if I want to access the raw events.
However, the downsides are:
- I need to also define kfuncs for BPF_PROG_TYPE_SYSCALL (I don't
think it'll be a big issue)
- The only way I could store the bpf_prog into the map was to hack
around the map ops, because the fd of the map in the skel is not
available while doing a SEC("syscall") from a different process.
Also, I wonder if we should not have some way to namespace kfuncs.
Ideally, I would like to prevent the usage of those kfuncs outside of
some helpers that I define in HID so I don't have to worry too much
about other trace programs fuzzing and segfaulting the kernel.
>
> > >
> > > > - when running through bpf_prog_test_run_opts, how can we ensure we are
> > > > talking to the correct device? (I have a feeling this is linked to the
> > > > previous point)
> > > > - how can we reconnect the device when a report descriptor fixup BPF
> > > > program is loaded (would it make sense to allow some notifications on
> > > > when a BPF program is attached/detached to a device, and which
> > > > function have been traced?)
> > >
> > > Not sure I follow. What kind of notification do you have in mind?
> > > To user space?
> > >
> >
> > No, this is in-kernel notifications.
> > What I want to do, is when I load a BPF program that changes the HID
> > report descriptor, hid-core detects that and reconnects the attached
> > device.
> >
> > But after a couple of days of thinking, and with the above approach
> > where HID would preload a BPF program, I should be able to achieve
> > that with the "register BPF through a HID kfunc call". When I see that
> > we are attaching a HID report descriptor fixup to a given HID device,
> > I can then reconnect the matching device.
> >
> > It would certainly be cleaner to have a general "notify me when a
> > tracer is attached to this particular function", but we can hide that
> > right now with a preloaded BPF program :)
>
> There are a few lsm hooks in bpf core. It will probably be weird
> for hid core to hook into them. We can add a few tracepoints
> at attach functions if that helps.
>
With the above, I actually don't need those notifications. Users are
calling my own "syscall" when attaching their BPF prog, and they are
also not required to attach the program to the actual BPF trace
(through BPF_PROG_ATTACH), so this request is now gone :)
Cheers,
Benjamin
On Sat, Apr 30, 2022 at 09:12:09AM +0200, Benjamin Tissoires wrote:
>
> Also, I wonder if we should not have some way to namespace kfuncs.
> Ideally, I would like to prevent the usage of those kfuncs outside of
> some helpers that I define in HID so I don't have to worry too much
> about other trace programs fuzzing and segfaulting the kernel.
That would be a great feature to have. Other folks expressed the same interest.
Just grouping them by prog type is not flexible enough.
It feels kfuncs could be scoped by (prog_type, attach_btf_id or attach_hook) pair.
What are your thoughts?
On Mon, May 02, 2022 at 11:43:51PM +0200, Benjamin Tissoires wrote:
> On Sat, Apr 30, 2022 at 9:12 AM Benjamin Tissoires
> <[email protected]> wrote:
> >
> [...]
> > This is roughly what I have now:
> >
> > - hid-core is not aware of BPF except for a few __weak
> > ALLOW_ERROR_INJECTION hooks (dispatch_hid_bpf_device_event for
> > example)
> > - I have a separate hid-bpf module that attaches BPF traces to these
> > hooks and calls a "dispatch" kfunc within hid-bpf
> > - the dispatch function then do a succession of BPF calls to programs
> > attached to it by using bpf_tail_call(prog_array, hid_id)
> >
> > - for the clients, they define one or more
> > SEC("fmod_ret/hid_bpf_device_event"). That __weak hook is declared in
> > the kernel by hid-bpf but is never called, it's just an API
> > declaration
> > - then clients call in a SEC("syscall")
> > hid_bpf_attach_prog(ctx->prog_fd, ctx->hid_id, ctx->flags);
> > - hid_bpf_attach_prog is a kfunc that takes a ref on the struct
> > bpf_prog*, and stores that program in the correct struct bpf_map *for
> > the given attached_btf_id (hid_bpf_device_event in our case)
> >
> > And that's about it.
> > I still need to handle automatic release of the bpf prog when there is
> > no userspace open fd on it unless it's pinned but I think this should
> > be working fine.
> >
> > I also probably need to pin some SEC("syscall") (hid_bpf_attach_prog
> > and hid_bpf_dettach_prog) so users don't have to write them down and
> > can just use the ones provided by the kernel.
> >
> > The nice thing is that I can define my own API for the attach call
> > without dealing with bpf core. I can thus add a priority flag that is
> > relevant here because the data coming through the bpf program can be
> > modified.
> >
> > The other thing is that now, I don't care which function we are in to
> > decide if a RET_PTR_MEM is read only or not. I can deal with that by
> > either playing with the flags or even replacing entirely the dispatch
> > trace prog from userspace if I want to access the raw events.
> >
> > However, the downsides are:
> > - I need to also define kfuncs for BPF_PROG_TYPE_SYSCALL (I don't
> > think It'll be a big issue)
> > - The only way I could store the bpf_prog into the map was to hack
> > around the map ops, because the fd of the map in the skel is not
> > available while doing a SEC("syscall") from a different process.
>
> Update on this side: I realized that I could use the syscall
> BPF_MAP_GET_FD_BY_ID instead to get an fd for the current task.
> However, I've been bitten quite hard today because I was using
> bpf_map_get() instead of bpf_map_get_with_uref(), and so every time I
> closed the fd in the syscall the map was cleared...
>
> But now I would like to have more than one program of a type per hid
> device, meaning that I can not have only one bpf_map of type
> BPF_MAP_TYPE_PROG_ARRAY.
> I have explored BPF_MAP_TYPE_HASH_OF_MAPS, but we can not have
> BPF_MAP_TYPE_PROG_ARRAY as inner maps with the current code. And I'd
> need 2 levels of nesting (which is not authorized today):
> - hid_jmp_table (key: HID id)
> - array of different program type per HID device (key: HID_BPF_PROG_TYPE)
> - BPF_MAP_TYPE_PROG_ARRAY with the actual progs (key: int)
>
> The other solution would be to be able to create a map when needed,
> store it in struct hid_device, and then call bpf_tail_call on this
> map. The problem is that I need a way to teach the verifier that the
> struct bpf_map pointer I have in the context is a true bpf_map...
We have kptr feature now.
So bpf progs can store pointers to specific kernel data structures
inside map values.
Storing 'struct bpf_map *' in a map value would be something :)
Circular dependency issues to address. Maybe it's doable.
Would hash based prog_array work ?
Then the key can be an arbitrary combination.
There is fd_htab logic. It's used for map-in-map.
We can tweak it to store progs in a hash map.
On Thu, May 12, 2022 at 6:23 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Sat, Apr 30, 2022 at 09:12:09AM +0200, Benjamin Tissoires wrote:
> >
> > Also, I wonder if we should not have some way to namespace kfuncs.
> > Ideally, I would like to prevent the usage of those kfuncs outside of
> > some helpers that I define in HID so I don't have to worry too much
> > about other trace programs fuzzing and segfaulting the kernel.
>
> That would be a great feature to have. Other folks expressed the same interest.
> Just grouping them by prog type is not flexible enough.
> It feels kfuncs could be scoped by (prog_type, attach_btf_id or attach_hook) pair.
> What are your thoughts?
>
Scoping by attach_btf_id is very appealing to me (attach_hook less TBH):
I have internal functions I do not want normal users to use, and also
it would also restrict who can call what in the more general case.
However, I don't think I'll put that effort in v5. It is a nice to
have feature IMO, but not really required ATM.
Cheers,
Benjamin
On Fri, May 13, 2022 at 10:02 AM Benjamin Tissoires
<[email protected]> wrote:
>
> On Thu, May 12, 2022 at 6:23 AM Alexei Starovoitov
> <[email protected]> wrote:
> >
> > On Sat, Apr 30, 2022 at 09:12:09AM +0200, Benjamin Tissoires wrote:
> > >
> > > Also, I wonder if we should not have some way to namespace kfuncs.
> > > Ideally, I would like to prevent the usage of those kfuncs outside of
> > > some helpers that I define in HID so I don't have to worry too much
> > > about other trace programs fuzzing and segfaulting the kernel.
> >
> > That would be a great feature to have. Other folks expressed the same interest.
> > Just grouping them by prog type is not flexible enough.
> > It feels kfuncs could be scoped by (prog_type, attach_btf_id or attach_hook) pair.
> > What are your thoughts?
> >
>
> Scoping by attach_btf_id is very appealing to me (attach_hook less TBH):
> I have internal functions I do not want normal users to use, and also
> it would also restrict who can call what in the more general case.
>
> However, I don't think I'll put that effort in v5. It is a nice to
> have feature IMO, but not really required ATM.
Great. Looking forward to it.
On Thu, May 12, 2022 at 6:16 AM Alexei Starovoitov
<[email protected]> wrote:
>
> On Mon, May 02, 2022 at 11:43:51PM +0200, Benjamin Tissoires wrote:
> > On Sat, Apr 30, 2022 at 9:12 AM Benjamin Tissoires
> > <[email protected]> wrote:
> > >
> > [...]
> > > This is roughly what I have now:
> > >
> > > - hid-core is not aware of BPF except for a few __weak
> > > ALLOW_ERROR_INJECTION hooks (dispatch_hid_bpf_device_event for
> > > example)
> > > - I have a separate hid-bpf module that attaches BPF traces to these
> > > hooks and calls a "dispatch" kfunc within hid-bpf
> > > - the dispatch function then do a succession of BPF calls to programs
> > > attached to it by using bpf_tail_call(prog_array, hid_id)
> > >
> > > - for the clients, they define one or more
> > > SEC("fmod_ret/hid_bpf_device_event"). That __weak hook is declared in
> > > the kernel by hid-bpf but is never called, it's just an API
> > > declaration
> > > - then clients call in a SEC("syscall")
> > > hid_bpf_attach_prog(ctx->prog_fd, ctx->hid_id, ctx->flags);
> > > - hid_bpf_attach_prog is a kfunc that takes a ref on the struct
> > > bpf_prog*, and stores that program in the correct struct bpf_map *for
> > > the given attached_btf_id (hid_bpf_device_event in our case)
> > >
> > > And that's about it.
> > > I still need to handle automatic release of the bpf prog when there is
> > > no userspace open fd on it unless it's pinned but I think this should
> > > be working fine.
> > >
> > > I also probably need to pin some SEC("syscall") (hid_bpf_attach_prog
> > > and hid_bpf_dettach_prog) so users don't have to write them down and
> > > can just use the ones provided by the kernel.
> > >
> > > The nice thing is that I can define my own API for the attach call
> > > without dealing with bpf core. I can thus add a priority flag that is
> > > relevant here because the data coming through the bpf program can be
> > > modified.
> > >
> > > The other thing is that now, I don't care which function we are in to
> > > decide if a RET_PTR_MEM is read only or not. I can deal with that by
> > > either playing with the flags or even replacing entirely the dispatch
> > > trace prog from userspace if I want to access the raw events.
> > >
> > > However, the downsides are:
> > > - I need to also define kfuncs for BPF_PROG_TYPE_SYSCALL (I don't
> > > think It'll be a big issue)
> > > - The only way I could store the bpf_prog into the map was to hack
> > > around the map ops, because the fd of the map in the skel is not
> > > available while doing a SEC("syscall") from a different process.
> >
> > Update on this side: I realized that I could use the syscall
> > BPF_MAP_GET_FD_BY_ID instead to get an fd for the current task.
> > However, I've been bitten quite hard today because I was using
> > bpf_map_get() instead of bpf_map_get_with_uref(), and so every time I
> > closed the fd in the syscall the map was cleared...
> >
> > But now I would like to have more than one program of a type per hid
> > device, meaning that I can not have only one bpf_map of type
> > BPF_MAP_TYPE_PROG_ARRAY.
> > I have explored BPF_MAP_TYPE_HASH_OF_MAPS, but we can not have
> > BPF_MAP_TYPE_PROG_ARRAY as inner maps with the current code. And I'd
> > need 2 levels of nesting (which is not authorized today):
> > - hid_jmp_table (key: HID id)
> > - array of different program type per HID device (key: HID_BPF_PROG_TYPE)
> > - BPF_MAP_TYPE_PROG_ARRAY with the actual progs (key: int)
> >
> > The other solution would be to be able to create a map when needed,
> > store it in struct hid_device, and then call bpf_tail_call on this
> > map. The problem is that I need a way to teach the verifier that the
> > struct bpf_map pointer I have in the context is a true bpf_map...
>
> We have kptr feature now.
> So bpf progs can store pointers to specific kernel data structures
> inside map values.
> Storing 'struct bpf_map *' in a map value would be something :)
> Circular dependency issues to address. Maybe it's doable.
>
> Would hash based prog_array work ?
> Then the key can be an arbitrary combination.
> There is fd_htab logic. It's used for map-in-map.
> We can tweak it to store progs in a hash map.
>
In the end, I am just using a global prog_array map and handling the
indexes myself. It is probably not the cleanest or most reusable
approach, but it allows me at least to move forward.
FWIW, I should be able to send v5 next week. I am almost done
reimplementing everything I had in v3, and I am now fighting with
hid.ko as a module (should be solved soon enough).
Cheers,
Benjamin