[Sending this as an RFC because I'm pretty sure I'm not doing things
correctly at the BPF level.]
[Also using bpf-next as the base tree, as there would be conflicting
changes otherwise.]
Ideally I'd like to have something similar to bpf_timers, but not
in soft IRQ context. So I'm emulating this with a sleepable
bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
workqueue").
Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
Sometimes we receive an out-of-proximity event, but the device cannot
be trusted enough, and we need to ensure that we won't receive another
one in the following n milliseconds. So we need to wait those n
milliseconds, and possibly re-inject that event into the stack.
2. inject new events in reaction to one given event:
We might want to transform one given event into several. This is the
case for macro keys, where a single key press is supposed to send
a sequence of key presses. But this could also be used to patch a
faulty behavior, for example when a device forgets to send a release event.
3. communicate with the device in reaction to one event:
We might want to communicate back to the device after a given event.
For example, a device might send us an event saying that it came back
from its sleep state and needs to be re-initialized.
Currently we can achieve this by keeping a userspace program around,
raising a bpf event, and letting that userspace program inject the
events and commands.
However, that program is then kept alive as a daemon just to schedule
commands. There is no logic in it, so it doesn't really justify an
actual userspace wakeup, and a kernel workqueue seems simpler to handle.
Besides the "this is insane to reimplement the wheel", the other part I'm
not sure is whether we can say that BPF maps of type prog_array and of
type queue/stack can be used in sleepable context.
I don't see any warning when running the test programs, but that's probably
not a guarantee I'm doing the things properly :)
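To make the intended flow more concrete, here is a condensed sketch of
what a consumer would look like (adapted from the selftests in this
series; the struct/map/program names are only illustrative, and the usual
vmlinux.h / bpf_helpers.h / hid_bpf_helpers.h includes are assumed): the
IRQ-side program stashes the event in a queue map and kicks the per-device
workqueue, and the sleepable program attached to hid_bpf_offload later
pops it and is allowed to talk to the device.

struct deferred_report {
	__u8 data[6];
};

struct {
	__uint(type, BPF_MAP_TYPE_QUEUE);
	__uint(max_entries, 8);
	__type(value, struct deferred_report);
} queue SEC(".maps");

SEC("fmod_ret/hid_bpf_device_event")
int BPF_PROG(defer_event, struct hid_bpf_ctx *hctx)
{
	__u8 *data = hid_bpf_get_data(hctx, 0 /* offset */, 6 /* size */);
	struct deferred_report buf;

	if (!data)
		return 0; /* EPERM check */

	/* IRQ context: stash the event and schedule the sleepable job */
	__builtin_memcpy(buf.data, data, 6);
	bpf_map_push_elem(&queue, &buf, BPF_ANY);
	hid_bpf_schedule_delayed_work(hctx, 10 /* ms */);

	return -1; /* discard the original event for now */
}

SEC("fmod_ret.s/hid_bpf_offload")
int BPF_PROG(process_deferred, struct hid_bpf_ctx *hctx)
{
	struct deferred_report buf;

	if (bpf_map_pop_elem(&queue, &buf))
		return 0; /* nothing queued for this device */

	/* sleepable context: we can now re-inject the event or talk back
	 * to the device with the kfuncs flagged KF_SLEEPABLE in this series
	 */
	return 0;
}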
Cheers,
Benjamin
Signed-off-by: Benjamin Tissoires <[email protected]>
---
Benjamin Tissoires (9):
bpf: allow more maps in sleepable bpf programs
HID: bpf/dispatch: regroup kfuncs definitions
HID: bpf: export hid_hw_output_report as a BPF kfunc
selftests/hid: Add test for hid_bpf_hw_output_report
HID: bpf: allow to inject HID event from BPF
selftests/hid: add tests for hid_bpf_input_report
HID: bpf: allow to defer work in a delayed workqueue
selftests/hid: add test for hid_bpf_schedule_delayed_work
selftests/hid: add another set of delayed work tests
Documentation/hid/hid-bpf.rst | 26 +-
drivers/hid/bpf/entrypoints/entrypoints.bpf.c | 8 +
drivers/hid/bpf/entrypoints/entrypoints.lskel.h | 236 +++++++++------
drivers/hid/bpf/hid_bpf_dispatch.c | 315 ++++++++++++++++-----
drivers/hid/bpf/hid_bpf_jmp_table.c | 34 ++-
drivers/hid/hid-core.c | 2 +
include/linux/hid_bpf.h | 10 +
kernel/bpf/verifier.c | 3 +
tools/testing/selftests/hid/hid_bpf.c | 286 ++++++++++++++++++-
tools/testing/selftests/hid/progs/hid.c | 205 ++++++++++++++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 8 +
11 files changed, 962 insertions(+), 171 deletions(-)
---
base-commit: 4f7a05917237b006ceae760507b3d15305769ade
change-id: 20240205-hid-bpf-sleepable-c01260fd91c4
Best regards,
--
Benjamin Tissoires <[email protected]>
We currently only export hid_hw_raw_request() as a BPF kfunc.
However, some devices require an explicit write to their Output Report
instead of going through the control channel.
So also export hid_hw_output_report() to BPF.
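For reference, a minimal syscall HID-BPF program using the new kfunc
follows the same shape as the selftest added later in this series
(struct hid_hw_request_syscall_args being the selftest's own argument
structure, and the program name being only illustrative):

SEC("syscall")
int send_output_report(struct hid_hw_request_syscall_args *args)
{
	struct hid_bpf_ctx *ctx;
	const size_t size = args->size;
	int ret;

	if (size > sizeof(args->data))
		return -7; /* -E2BIG */

	ctx = hid_bpf_allocate_context(args->hid);
	if (!ctx)
		return -1; /* EPERM check */

	/* write the buffer on the device's Output Report */
	ret = hid_bpf_hw_output_report(ctx, args->data, size);
	args->retval = ret;

	hid_bpf_release_context(ctx);

	return 0;
}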
Signed-off-by: Benjamin Tissoires <[email protected]>
---
Documentation/hid/hid-bpf.rst | 2 +-
drivers/hid/bpf/hid_bpf_dispatch.c | 112 +++++++++++++++++++++++++++----------
drivers/hid/hid-core.c | 1 +
include/linux/hid_bpf.h | 1 +
4 files changed, 86 insertions(+), 30 deletions(-)
diff --git a/Documentation/hid/hid-bpf.rst b/Documentation/hid/hid-bpf.rst
index 4fad83a6ebc3..a575004d9025 100644
--- a/Documentation/hid/hid-bpf.rst
+++ b/Documentation/hid/hid-bpf.rst
@@ -179,7 +179,7 @@ Available API that can be used in syscall HID-BPF programs:
-----------------------------------------------------------
.. kernel-doc:: drivers/hid/bpf/hid_bpf_dispatch.c
- :functions: hid_bpf_attach_prog hid_bpf_hw_request hid_bpf_allocate_context hid_bpf_release_context
+ :functions: hid_bpf_attach_prog hid_bpf_hw_request hid_bpf_hw_output_report hid_bpf_allocate_context hid_bpf_release_context
General overview of a HID-BPF program
=====================================
diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
index 52abb27426f4..a5b88b491b80 100644
--- a/drivers/hid/bpf/hid_bpf_dispatch.c
+++ b/drivers/hid/bpf/hid_bpf_dispatch.c
@@ -376,6 +376,46 @@ hid_bpf_release_context(struct hid_bpf_ctx *ctx)
put_device(&hid->dev);
}
+static int
+__hid_bpf_hw_check_params(struct hid_bpf_ctx *ctx, __u8 *buf, size_t *buf__sz,
+ enum hid_report_type rtype)
+{
+ struct hid_report_enum *report_enum;
+ struct hid_report *report;
+ struct hid_device *hdev;
+ u32 report_len;
+
+ /* check arguments */
+ if (!ctx || !hid_bpf_ops || !buf)
+ return -EINVAL;
+
+ switch (rtype) {
+ case HID_INPUT_REPORT:
+ case HID_OUTPUT_REPORT:
+ case HID_FEATURE_REPORT:
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (*buf__sz < 1)
+ return -EINVAL;
+
+ hdev = (struct hid_device *)ctx->hid; /* discard const */
+
+ report_enum = hdev->report_enum + rtype;
+ report = hid_bpf_ops->hid_get_report(report_enum, buf);
+ if (!report)
+ return -EINVAL;
+
+ report_len = hid_report_len(report);
+
+ if (*buf__sz > report_len)
+ *buf__sz = report_len;
+
+ return 0;
+}
+
/**
* hid_bpf_hw_request - Communicate with a HID device
*
@@ -392,24 +432,14 @@ hid_bpf_hw_request(struct hid_bpf_ctx *ctx, __u8 *buf, size_t buf__sz,
enum hid_report_type rtype, enum hid_class_request reqtype)
{
struct hid_device *hdev;
- struct hid_report *report;
- struct hid_report_enum *report_enum;
+ size_t size = buf__sz;
u8 *dma_data;
- u32 report_len;
int ret;
/* check arguments */
- if (!ctx || !hid_bpf_ops || !buf)
- return -EINVAL;
-
- switch (rtype) {
- case HID_INPUT_REPORT:
- case HID_OUTPUT_REPORT:
- case HID_FEATURE_REPORT:
- break;
- default:
- return -EINVAL;
- }
+ ret = __hid_bpf_hw_check_params(ctx, buf, &size, rtype);
+ if (ret)
+ return ret;
switch (reqtype) {
case HID_REQ_GET_REPORT:
@@ -423,29 +453,16 @@ hid_bpf_hw_request(struct hid_bpf_ctx *ctx, __u8 *buf, size_t buf__sz,
return -EINVAL;
}
- if (buf__sz < 1)
- return -EINVAL;
-
hdev = (struct hid_device *)ctx->hid; /* discard const */
- report_enum = hdev->report_enum + rtype;
- report = hid_bpf_ops->hid_get_report(report_enum, buf);
- if (!report)
- return -EINVAL;
-
- report_len = hid_report_len(report);
-
- if (buf__sz > report_len)
- buf__sz = report_len;
-
- dma_data = kmemdup(buf, buf__sz, GFP_KERNEL);
+ dma_data = kmemdup(buf, size, GFP_KERNEL);
if (!dma_data)
return -ENOMEM;
ret = hid_bpf_ops->hid_hw_raw_request(hdev,
dma_data[0],
dma_data,
- buf__sz,
+ size,
rtype,
reqtype);
@@ -455,6 +472,42 @@ hid_bpf_hw_request(struct hid_bpf_ctx *ctx, __u8 *buf, size_t buf__sz,
kfree(dma_data);
return ret;
}
+
+/**
+ * hid_bpf_hw_output_report - Send an output report to a HID device
+ *
+ * @ctx: the HID-BPF context previously allocated in hid_bpf_allocate_context()
+ * @buf: a %PTR_TO_MEM buffer
+ * @buf__sz: the size of the data to transfer
+ *
+ * @returns the number of bytes transferred on success, a negative error code otherwise.
+ */
+__bpf_kfunc int
+hid_bpf_hw_output_report(struct hid_bpf_ctx *ctx, __u8 *buf, size_t buf__sz)
+{
+ struct hid_device *hdev;
+ size_t size = buf__sz;
+ u8 *dma_data;
+ int ret;
+
+ /* check arguments */
+ ret = __hid_bpf_hw_check_params(ctx, buf, &size, HID_OUTPUT_REPORT);
+ if (ret)
+ return ret;
+
+ hdev = (struct hid_device *)ctx->hid; /* discard const */
+
+ dma_data = kmemdup(buf, size, GFP_KERNEL);
+ if (!dma_data)
+ return -ENOMEM;
+
+ ret = hid_bpf_ops->hid_hw_output_report(hdev,
+ dma_data,
+ size);
+
+ kfree(dma_data);
+ return ret;
+}
__bpf_kfunc_end_defs();
/*
@@ -488,6 +541,7 @@ BTF_ID_FLAGS(func, hid_bpf_attach_prog)
BTF_ID_FLAGS(func, hid_bpf_allocate_context, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, hid_bpf_release_context, KF_RELEASE)
BTF_ID_FLAGS(func, hid_bpf_hw_request)
+BTF_ID_FLAGS(func, hid_bpf_hw_output_report)
BTF_KFUNCS_END(hid_bpf_syscall_kfunc_ids)
static const struct btf_kfunc_id_set hid_bpf_syscall_kfunc_set = {
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index de7a477d6665..1243595890ba 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -2974,6 +2974,7 @@ EXPORT_SYMBOL_GPL(hid_check_keys_pressed);
static struct hid_bpf_ops hid_ops = {
.hid_get_report = hid_get_report,
.hid_hw_raw_request = hid_hw_raw_request,
+ .hid_hw_output_report = hid_hw_output_report,
.owner = THIS_MODULE,
.bus_type = &hid_bus_type,
};
diff --git a/include/linux/hid_bpf.h b/include/linux/hid_bpf.h
index 7118ac28d468..5c7ff93dc73e 100644
--- a/include/linux/hid_bpf.h
+++ b/include/linux/hid_bpf.h
@@ -103,6 +103,7 @@ struct hid_bpf_ops {
unsigned char reportnum, __u8 *buf,
size_t len, enum hid_report_type rtype,
enum hid_class_request reqtype);
+ int (*hid_hw_output_report)(struct hid_device *hdev, __u8 *buf, size_t len);
struct module *owner;
const struct bus_type *bus_type;
};
--
2.43.0
This time we need to ensure uhid actually receives the output report,
hence the new mutex and condition.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
tools/testing/selftests/hid/hid_bpf.c | 63 ++++++++++++++++++++++
tools/testing/selftests/hid/progs/hid.c | 24 +++++++++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 2 +
3 files changed, 89 insertions(+)
diff --git a/tools/testing/selftests/hid/hid_bpf.c b/tools/testing/selftests/hid/hid_bpf.c
index 2cf96f818f25..a6d72c8ae7cd 100644
--- a/tools/testing/selftests/hid/hid_bpf.c
+++ b/tools/testing/selftests/hid/hid_bpf.c
@@ -16,6 +16,11 @@
#define SHOW_UHID_DEBUG 0
+#define min(a,b) \
+ ({ __typeof__ (a) _a = (a); \
+ __typeof__ (b) _b = (b); \
+ _a < _b ? _a : _b; })
+
static unsigned char rdesc[] = {
0x06, 0x00, 0xff, /* Usage Page (Vendor Defined Page 1) */
0x09, 0x21, /* Usage (Vendor Usage 0x21) */
@@ -111,6 +116,10 @@ struct hid_hw_request_syscall_args {
static pthread_mutex_t uhid_started_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t uhid_started = PTHREAD_COND_INITIALIZER;
+static pthread_mutex_t uhid_output_mtx = PTHREAD_MUTEX_INITIALIZER;
+static pthread_cond_t uhid_output_cond = PTHREAD_COND_INITIALIZER;
+static unsigned char output_report[10];
+
/* no need to protect uhid_stopped, only one thread accesses it */
static bool uhid_stopped;
@@ -205,6 +214,13 @@ static int uhid_event(struct __test_metadata *_metadata, int fd)
break;
case UHID_OUTPUT:
UHID_LOG("UHID_OUTPUT from uhid-dev");
+
+ pthread_mutex_lock(&uhid_output_mtx);
+ memcpy(output_report,
+ ev.u.output.data,
+ min(ev.u.output.size, sizeof(output_report)));
+ pthread_cond_signal(&uhid_output_cond);
+ pthread_mutex_unlock(&uhid_output_mtx);
break;
case UHID_GET_REPORT:
UHID_LOG("UHID_GET_REPORT from uhid-dev");
@@ -733,6 +749,53 @@ TEST_F(hid_bpf, test_hid_change_report)
ASSERT_EQ(buf[2], 0) TH_LOG("leftovers_from_previous_test");
}
+/*
+ * Call hid_bpf_hw_output_report against the given uhid device,
+ * check that the program is called and does the expected.
+ */
+TEST_F(hid_bpf, test_hid_user_output_report_call)
+{
+ struct hid_hw_request_syscall_args args = {
+ .retval = -1,
+ .size = 10,
+ };
+ DECLARE_LIBBPF_OPTS(bpf_test_run_opts, tattrs,
+ .ctx_in = &args,
+ .ctx_size_in = sizeof(args),
+ );
+ int err, cond_err, prog_fd;
+ struct timespec time_to_wait;
+
+ LOAD_BPF;
+
+ args.hid = self->hid_id;
+ args.data[0] = 1; /* report ID */
+ args.data[1] = 2;
+ args.data[2] = 42;
+
+ prog_fd = bpf_program__fd(self->skel->progs.hid_user_output_report);
+
+ pthread_mutex_lock(&uhid_output_mtx);
+
+ memset(output_report, 0, sizeof(output_report));
+ clock_gettime(CLOCK_REALTIME, &time_to_wait);
+ time_to_wait.tv_sec += 2;
+
+ err = bpf_prog_test_run_opts(prog_fd, &tattrs);
+ cond_err = pthread_cond_timedwait(&uhid_output_cond, &uhid_output_mtx, &time_to_wait);
+
+ ASSERT_OK(err) TH_LOG("error while calling bpf_prog_test_run_opts");
+ ASSERT_OK(cond_err) TH_LOG("error while waiting for the condition");
+
+ ASSERT_EQ(args.retval, 3);
+
+ ASSERT_EQ(output_report[0], 1);
+ ASSERT_EQ(output_report[1], 2);
+ ASSERT_EQ(output_report[2], 42);
+
+ pthread_mutex_unlock(&uhid_output_mtx);
+}
+
/*
* Attach hid_user_raw_request to the given uhid device,
* call the bpf program from userspace
diff --git a/tools/testing/selftests/hid/progs/hid.c b/tools/testing/selftests/hid/progs/hid.c
index 1e558826b809..2c2b679a83b1 100644
--- a/tools/testing/selftests/hid/progs/hid.c
+++ b/tools/testing/selftests/hid/progs/hid.c
@@ -101,6 +101,30 @@ int hid_user_raw_request(struct hid_hw_request_syscall_args *args)
return 0;
}
+SEC("syscall")
+int hid_user_output_report(struct hid_hw_request_syscall_args *args)
+{
+ struct hid_bpf_ctx *ctx;
+ const size_t size = args->size;
+ int i, ret = 0;
+
+ if (size > sizeof(args->data))
+ return -7; /* -E2BIG */
+
+ ctx = hid_bpf_allocate_context(args->hid);
+ if (!ctx)
+ return -1; /* EPERM check */
+
+ ret = hid_bpf_hw_output_report(ctx,
+ args->data,
+ size);
+ args->retval = ret;
+
+ hid_bpf_release_context(ctx);
+
+ return 0;
+}
+
static const __u8 rdesc[] = {
0x05, 0x01, /* USAGE_PAGE (Generic Desktop) */
0x09, 0x32, /* USAGE (Z) */
diff --git a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
index 65e657ac1198..50c6a0d5765e 100644
--- a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
+++ b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
@@ -94,5 +94,7 @@ extern int hid_bpf_hw_request(struct hid_bpf_ctx *ctx,
size_t buf__sz,
enum hid_report_type type,
enum hid_class_request reqtype) __ksym;
+extern int hid_bpf_hw_output_report(struct hid_bpf_ctx *ctx,
+ __u8 *buf, size_t buf__sz) __ksym;
#endif /* __HID_BPF_HELPERS_H */
--
2.43.0
These 3 map types are required for HID-BPF when a user wants to do
IO with a device from a sleepable tracing point.
HID-BPF relies on `bpf_tail_call()` to selectively call the BPF programs
attached to a device, thus we need BPF_MAP_TYPE_PROG_ARRAY access.
Allowing BPF_MAP_TYPE_QUEUE (and therefore BPF_MAP_TYPE_STACK) lets
a BPF program prepare, from IRQ context, the list of HID commands to send
back to the device; these commands can then be retrieved from the
sleepable tracing point.
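As an illustration, the sleepable dispatch entrypoint added later in this
series tail-calls into the PROG_ARRAY from a fmod_ret.s program, which the
verifier would reject without this change (the queue usage from a sleepable
program is exercised by the new selftests):

SEC("fmod_ret.s/__hid_bpf_tail_call_sleepable")
int BPF_PROG(hid_tail_call_sleepable, struct hid_bpf_ctx *hctx)
{
	bpf_tail_call(ctx, &hid_jmp_table, hctx->index);

	return 0;
}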
Signed-off-by: Benjamin Tissoires <[email protected]>
---
kernel/bpf/verifier.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 64fa188d00ad..3cc880ecd3e4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -18028,6 +18028,9 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
case BPF_MAP_TYPE_SK_STORAGE:
case BPF_MAP_TYPE_TASK_STORAGE:
case BPF_MAP_TYPE_CGRP_STORAGE:
+ case BPF_MAP_TYPE_PROG_ARRAY:
+ case BPF_MAP_TYPE_QUEUE:
+ case BPF_MAP_TYPE_STACK:
break;
default:
verbose(env,
--
2.43.0
Usual way of testing: we call the function and ensure we receive
the event.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
tools/testing/selftests/hid/hid_bpf.c | 49 +++++++++++++++++++++-
tools/testing/selftests/hid/progs/hid.c | 22 ++++++++++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 4 ++
3 files changed, 73 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/hid/hid_bpf.c b/tools/testing/selftests/hid/hid_bpf.c
index a6d72c8ae7cd..9ff02c737144 100644
--- a/tools/testing/selftests/hid/hid_bpf.c
+++ b/tools/testing/selftests/hid/hid_bpf.c
@@ -749,6 +749,52 @@ TEST_F(hid_bpf, test_hid_change_report)
ASSERT_EQ(buf[2], 0) TH_LOG("leftovers_from_previous_test");
}
+/*
+ * Call hid_bpf_input_report against the given uhid device,
+ * check that the program is called and does the expected.
+ */
+TEST_F(hid_bpf, test_hid_user_input_report_call)
+{
+ struct hid_hw_request_syscall_args args = {
+ .retval = -1,
+ .size = 10,
+ };
+ DECLARE_LIBBPF_OPTS(bpf_test_run_opts, tattrs,
+ .ctx_in = &args,
+ .ctx_size_in = sizeof(args),
+ );
+ __u8 buf[10] = {0};
+ int err, prog_fd;
+
+ LOAD_BPF;
+
+ args.hid = self->hid_id;
+ args.data[0] = 1; /* report ID */
+ args.data[1] = 2;
+ args.data[2] = 42;
+
+ prog_fd = bpf_program__fd(self->skel->progs.hid_user_input_report);
+
+ /* check that there is no data to read from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, -1) TH_LOG("read_hidraw");
+
+ err = bpf_prog_test_run_opts(prog_fd, &tattrs);
+
+ ASSERT_OK(err) TH_LOG("error while calling bpf_prog_test_run_opts");
+
+ ASSERT_EQ(args.retval, 0);
+
+ /* read the data from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, 6) TH_LOG("read_hidraw");
+ ASSERT_EQ(buf[0], 1);
+ ASSERT_EQ(buf[1], 2);
+ ASSERT_EQ(buf[2], 42);
+}
+
/*
* Call hid_bpf_hw_output_report against the given uhid device,
* check that the program is called and does the expected.
@@ -797,8 +843,7 @@ TEST_F(hid_bpf, test_hid_user_output_report_call)
}
/*
- * Attach hid_user_raw_request to the given uhid device,
- * call the bpf program from userspace
+ * Call hid_hw_raw_request against the given uhid device,
* check that the program is called and does the expected.
*/
TEST_F(hid_bpf, test_hid_user_raw_request_call)
diff --git a/tools/testing/selftests/hid/progs/hid.c b/tools/testing/selftests/hid/progs/hid.c
index 2c2b679a83b1..f67d35def142 100644
--- a/tools/testing/selftests/hid/progs/hid.c
+++ b/tools/testing/selftests/hid/progs/hid.c
@@ -125,6 +125,28 @@ int hid_user_output_report(struct hid_hw_request_syscall_args *args)
return 0;
}
+SEC("syscall")
+int hid_user_input_report(struct hid_hw_request_syscall_args *args)
+{
+ struct hid_bpf_ctx *ctx;
+ const size_t size = args->size;
+ int i, ret = 0;
+
+ if (size > sizeof(args->data))
+ return -7; /* -E2BIG */
+
+ ctx = hid_bpf_allocate_context(args->hid);
+ if (!ctx)
+ return -1; /* EPERM check */
+
+ ret = hid_bpf_input_report(ctx, HID_INPUT_REPORT, args->data, size);
+ args->retval = ret;
+
+ hid_bpf_release_context(ctx);
+
+ return 0;
+}
+
static const __u8 rdesc[] = {
0x05, 0x01, /* USAGE_PAGE (Generic Desktop) */
0x09, 0x32, /* USAGE (Z) */
diff --git a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
index 50c6a0d5765e..9cd56821d0f1 100644
--- a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
+++ b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
@@ -96,5 +96,9 @@ extern int hid_bpf_hw_request(struct hid_bpf_ctx *ctx,
enum hid_class_request reqtype) __ksym;
extern int hid_bpf_hw_output_report(struct hid_bpf_ctx *ctx,
__u8 *buf, size_t buf__sz) __ksym;
+extern int hid_bpf_input_report(struct hid_bpf_ctx *ctx,
+ enum hid_report_type type,
+ __u8 *data,
+ size_t buf__sz) __ksym;
#endif /* __HID_BPF_HELPERS_H */
--
2.43.0
No code change: just move the hid_bpf_get_data() kfunc definition down
so we have only one block of __bpf_kfunc_start/end_defs().
Signed-off-by: Benjamin Tissoires <[email protected]>
---
drivers/hid/bpf/hid_bpf_dispatch.c | 80 ++++++++++++++++++--------------------
1 file changed, 38 insertions(+), 42 deletions(-)
diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
index e630caf644e8..52abb27426f4 100644
--- a/drivers/hid/bpf/hid_bpf_dispatch.c
+++ b/drivers/hid/bpf/hid_bpf_dispatch.c
@@ -143,48 +143,6 @@ u8 *call_hid_bpf_rdesc_fixup(struct hid_device *hdev, u8 *rdesc, unsigned int *s
}
EXPORT_SYMBOL_GPL(call_hid_bpf_rdesc_fixup);
-/* Disables missing prototype warnings */
-__bpf_kfunc_start_defs();
-
-/**
- * hid_bpf_get_data - Get the kernel memory pointer associated with the context @ctx
- *
- * @ctx: The HID-BPF context
- * @offset: The offset within the memory
- * @rdwr_buf_size: the const size of the buffer
- *
- * @returns %NULL on error, an %__u8 memory pointer on success
- */
-__bpf_kfunc __u8 *
-hid_bpf_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t rdwr_buf_size)
-{
- struct hid_bpf_ctx_kern *ctx_kern;
-
- if (!ctx)
- return NULL;
-
- ctx_kern = container_of(ctx, struct hid_bpf_ctx_kern, ctx);
-
- if (rdwr_buf_size + offset > ctx->allocated_size)
- return NULL;
-
- return ctx_kern->data + offset;
-}
-__bpf_kfunc_end_defs();
-
-/*
- * The following set contains all functions we agree BPF programs
- * can use.
- */
-BTF_KFUNCS_START(hid_bpf_kfunc_ids)
-BTF_ID_FLAGS(func, hid_bpf_get_data, KF_RET_NULL)
-BTF_KFUNCS_END(hid_bpf_kfunc_ids)
-
-static const struct btf_kfunc_id_set hid_bpf_kfunc_set = {
- .owner = THIS_MODULE,
- .set = &hid_bpf_kfunc_ids,
-};
-
static int device_match_id(struct device *dev, const void *id)
{
struct hid_device *hdev = to_hid_device(dev);
@@ -281,6 +239,31 @@ static int do_hid_bpf_attach_prog(struct hid_device *hdev, int prog_fd, struct b
/* Disables missing prototype warnings */
__bpf_kfunc_start_defs();
+/**
+ * hid_bpf_get_data - Get the kernel memory pointer associated with the context @ctx
+ *
+ * @ctx: The HID-BPF context
+ * @offset: The offset within the memory
+ * @rdwr_buf_size: the const size of the buffer
+ *
+ * @returns %NULL on error, an %__u8 memory pointer on success
+ */
+__bpf_kfunc __u8 *
+hid_bpf_get_data(struct hid_bpf_ctx *ctx, unsigned int offset, const size_t rdwr_buf_size)
+{
+ struct hid_bpf_ctx_kern *ctx_kern;
+
+ if (!ctx)
+ return NULL;
+
+ ctx_kern = container_of(ctx, struct hid_bpf_ctx_kern, ctx);
+
+ if (rdwr_buf_size + offset > ctx->allocated_size)
+ return NULL;
+
+ return ctx_kern->data + offset;
+}
+
/**
* hid_bpf_attach_prog - Attach the given @prog_fd to the given HID device
*
@@ -474,6 +457,19 @@ hid_bpf_hw_request(struct hid_bpf_ctx *ctx, __u8 *buf, size_t buf__sz,
}
__bpf_kfunc_end_defs();
+/*
+ * The following set contains all functions we agree BPF programs
+ * can use.
+ */
+BTF_KFUNCS_START(hid_bpf_kfunc_ids)
+BTF_ID_FLAGS(func, hid_bpf_get_data, KF_RET_NULL)
+BTF_KFUNCS_END(hid_bpf_kfunc_ids)
+
+static const struct btf_kfunc_id_set hid_bpf_kfunc_set = {
+ .owner = THIS_MODULE,
+ .set = &hid_bpf_kfunc_ids,
+};
+
/* our HID-BPF entrypoints */
BTF_SET8_START(hid_bpf_fmodret_ids)
BTF_ID_FLAGS(func, hid_bpf_device_event)
--
2.43.0
This test checks that the job called by hid_bpf_offload() is actually
executed in a sleepable context.
When an event is injected, we push it on a map of type queue and schedule
a work.
When that work kicks in, it pulls the event from the queue and wakes
up userspace through a ring buffer.
The ring buffer is used so that we do not need sleeps in userspace,
because we have no guarantee about when those jobs will actually be
called.
Signed-off-by: Benjamin Tissoires <[email protected]>
---
tools/testing/selftests/hid/hid_bpf.c | 57 ++++++++++++++++++++++
tools/testing/selftests/hid/progs/hid.c | 55 +++++++++++++++++++++
.../testing/selftests/hid/progs/hid_bpf_helpers.h | 2 +
3 files changed, 114 insertions(+)
diff --git a/tools/testing/selftests/hid/hid_bpf.c b/tools/testing/selftests/hid/hid_bpf.c
index 9ff02c737144..bb95ff90951b 100644
--- a/tools/testing/selftests/hid/hid_bpf.c
+++ b/tools/testing/selftests/hid/hid_bpf.c
@@ -875,6 +875,63 @@ TEST_F(hid_bpf, test_hid_user_raw_request_call)
ASSERT_EQ(args.data[1], 2);
}
+static __u8 workload_data;
+
+static int handle_event(void *ctx, void *data, size_t data_sz)
+{
+ const __u8 *e = data;
+
+ workload_data = *e;
+
+ return 0;
+}
+
+/*
+ * Call hid_bpf_schedule_delayed_work against the given uhid device,
+ * check that the program is called and does the expected.
+ */
+TEST_F(hid_bpf, test_hid_schedule_work)
+{
+ const struct test_program progs[] = {
+ { .name = "hid_defer_event" },
+ { .name = "hid_offload_notify" },
+ };
+ struct ring_buffer *rb = NULL;
+ __u8 buf[10] = {0};
+ __u32* delay;
+ int err;
+
+ LOAD_PROGRAMS(progs);
+
+ /* Set up ring buffer polling */
+ rb = ring_buffer__new(bpf_map__fd(self->skel->maps.rb), handle_event, NULL, NULL);
+ ASSERT_OK_PTR(rb) TH_LOG("Failed to create ring buffer");
+ ASSERT_EQ(workload_data, 0);
+
+ delay = (__u32 *)&buf[2];
+
+ /* inject one event */
+ buf[0] = 1;
+ buf[1] = 42; /* this will be placed in the ring buffer */
+ *delay = 0;
+ uhid_send_event(_metadata, self->uhid_fd, buf, 6);
+
+ err = ring_buffer__poll(rb, 100 /* timeout, ms */);
+ ASSERT_EQ(err, 1) TH_LOG("error while calling ring_buffer__poll");
+
+ ASSERT_EQ(workload_data, 42);
+
+ /* inject a second one */
+ buf[0] = 1;
+ buf[1] = 53; /* this will be placed in the ring buffer */
+ *delay = 100;
+ uhid_send_event(_metadata, self->uhid_fd, buf, 6);
+
+ err = ring_buffer__poll(rb, 1000 /* timeout, ms */);
+ ASSERT_EQ(err, 1) TH_LOG("error while calling ring_buffer__poll");
+ ASSERT_EQ(workload_data, 53);
+}
+
/*
* Attach hid_insert{0,1,2} to the given uhid device,
* retrieve and open the matching hidraw node,
diff --git a/tools/testing/selftests/hid/progs/hid.c b/tools/testing/selftests/hid/progs/hid.c
index f67d35def142..95a03fb0494a 100644
--- a/tools/testing/selftests/hid/progs/hid.c
+++ b/tools/testing/selftests/hid/progs/hid.c
@@ -250,3 +250,58 @@ int BPF_PROG(hid_test_insert3, struct hid_bpf_ctx *hid_ctx)
return 0;
}
+
+struct test_report {
+ __u8 data[6];
+};
+
+struct {
+ __uint(type, BPF_MAP_TYPE_QUEUE);
+ __uint(max_entries, 8);
+ __type(value, struct test_report);
+} queue SEC(".maps");
+
+struct {
+ __uint(type, BPF_MAP_TYPE_RINGBUF);
+ __uint(max_entries, 8);
+} rb SEC(".maps");
+
+SEC("?fmod_ret.s/hid_bpf_offload")
+int BPF_PROG(hid_offload_notify, struct hid_bpf_ctx *hid_ctx)
+{
+ struct test_report buf;
+ __u8 *rb_elem;
+
+ if (bpf_map_pop_elem(&queue, &buf))
+ return 0;
+
+ rb_elem = bpf_ringbuf_reserve(&rb, sizeof(*rb_elem), 0);
+ if (!rb_elem)
+ return 0;
+
+ *rb_elem = buf.data[1];
+
+ bpf_ringbuf_submit(rb_elem, 0);
+
+ return 0;
+}
+
+SEC("?fmod_ret/hid_bpf_device_event")
+int BPF_PROG(hid_defer_event, struct hid_bpf_ctx *hctx)
+{
+ __u8 *data = hid_bpf_get_data(hctx, 0 /* offset */, 6 /* size */);
+ struct test_report buf;
+ __u32 delay;
+
+ if (!data)
+ return 0; /* EPERM check */
+
+ __builtin_memcpy(&buf.data, data, 6);
+
+ delay = *(__u32 *)&data[2];
+
+ bpf_map_push_elem(&queue, &buf, BPF_ANY);
+ hid_bpf_schedule_delayed_work(hctx, delay);
+
+ return -1; /* discard the event */
+}
diff --git a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
index 9cd56821d0f1..a844c7e89766 100644
--- a/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
+++ b/tools/testing/selftests/hid/progs/hid_bpf_helpers.h
@@ -100,5 +100,7 @@ extern int hid_bpf_input_report(struct hid_bpf_ctx *ctx,
enum hid_report_type type,
__u8 *data,
size_t buf__sz) __ksym;
+extern bool hid_bpf_schedule_delayed_work(struct hid_bpf_ctx *ctx,
+ unsigned int delay_ms) __ksym;
#endif /* __HID_BPF_HELPERS_H */
--
2.43.0
It can be interesting to inject events from BPF as if they came
from the device itself.
For example, some multitouch devices do not always send a proximity
out event, and we might want to inject it ourselves for the physical device.
Compared to uhid, we can now inject events on any physical device, not
just uhid virtual ones.
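As with hid_bpf_hw_request(), the kfunc is used from a context obtained
with hid_bpf_allocate_context(); a minimal syscall program, mirroring the
selftest patch that follows in this series (program name illustrative),
looks like:

SEC("syscall")
int inject_input_report(struct hid_hw_request_syscall_args *args)
{
	struct hid_bpf_ctx *ctx;
	const size_t size = args->size;

	if (size > sizeof(args->data))
		return -7; /* -E2BIG */

	ctx = hid_bpf_allocate_context(args->hid);
	if (!ctx)
		return -1; /* EPERM check */

	/* the report is processed as if the device itself had sent it */
	args->retval = hid_bpf_input_report(ctx, HID_INPUT_REPORT,
					    args->data, size);

	hid_bpf_release_context(ctx);

	return 0;
}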
Signed-off-by: Benjamin Tissoires <[email protected]>
---
Documentation/hid/hid-bpf.rst | 2 +-
drivers/hid/bpf/hid_bpf_dispatch.c | 29 +++++++++++++++++++++++++++++
drivers/hid/hid-core.c | 1 +
include/linux/hid_bpf.h | 2 ++
4 files changed, 33 insertions(+), 1 deletion(-)
diff --git a/Documentation/hid/hid-bpf.rst b/Documentation/hid/hid-bpf.rst
index a575004d9025..0765b3298ecf 100644
--- a/Documentation/hid/hid-bpf.rst
+++ b/Documentation/hid/hid-bpf.rst
@@ -179,7 +179,7 @@ Available API that can be used in syscall HID-BPF programs:
-----------------------------------------------------------
.. kernel-doc:: drivers/hid/bpf/hid_bpf_dispatch.c
- :functions: hid_bpf_attach_prog hid_bpf_hw_request hid_bpf_hw_output_report hid_bpf_allocate_context hid_bpf_release_context
+ :functions: hid_bpf_attach_prog hid_bpf_hw_request hid_bpf_hw_output_report hid_bpf_input_report hid_bpf_allocate_context hid_bpf_release_context
General overview of a HID-BPF program
=====================================
diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
index a5b88b491b80..e1a650f4a626 100644
--- a/drivers/hid/bpf/hid_bpf_dispatch.c
+++ b/drivers/hid/bpf/hid_bpf_dispatch.c
@@ -508,6 +508,34 @@ hid_bpf_hw_output_report(struct hid_bpf_ctx *ctx, __u8 *buf, size_t buf__sz)
kfree(dma_data);
return ret;
}
+
+/**
+ * hid_bpf_input_report - Inject a HID report in the kernel from a HID device
+ *
+ * @ctx: the HID-BPF context previously allocated in hid_bpf_allocate_context()
+ * @type: the type of the report (%HID_INPUT_REPORT, %HID_FEATURE_REPORT, %HID_OUTPUT_REPORT)
+ * @buf: a %PTR_TO_MEM buffer
+ * @buf__sz: the size of the data to transfer
+ *
+ * @returns %0 on success, a negative error code otherwise.
+ */
+__bpf_kfunc int
+hid_bpf_input_report(struct hid_bpf_ctx *ctx, enum hid_report_type type, u8 *buf,
+ const size_t buf__sz)
+{
+ struct hid_device *hdev;
+ size_t size = buf__sz;
+ int ret;
+
+ /* check arguments */
+ ret = __hid_bpf_hw_check_params(ctx, buf, &size, type);
+ if (ret)
+ return ret;
+
+ hdev = (struct hid_device *)ctx->hid; /* discard const */
+
+ return hid_input_report(hdev, type, buf, size, 0);
+}
__bpf_kfunc_end_defs();
/*
@@ -542,6 +570,7 @@ BTF_ID_FLAGS(func, hid_bpf_allocate_context, KF_ACQUIRE | KF_RET_NULL)
BTF_ID_FLAGS(func, hid_bpf_release_context, KF_RELEASE)
BTF_ID_FLAGS(func, hid_bpf_hw_request)
BTF_ID_FLAGS(func, hid_bpf_hw_output_report)
+BTF_ID_FLAGS(func, hid_bpf_input_report)
BTF_KFUNCS_END(hid_bpf_syscall_kfunc_ids)
static const struct btf_kfunc_id_set hid_bpf_syscall_kfunc_set = {
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 1243595890ba..b1fa0378e8f4 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -2975,6 +2975,7 @@ static struct hid_bpf_ops hid_ops = {
.hid_get_report = hid_get_report,
.hid_hw_raw_request = hid_hw_raw_request,
.hid_hw_output_report = hid_hw_output_report,
+ .hid_input_report = hid_input_report,
.owner = THIS_MODULE,
.bus_type = &hid_bus_type,
};
diff --git a/include/linux/hid_bpf.h b/include/linux/hid_bpf.h
index 5c7ff93dc73e..17b08f500098 100644
--- a/include/linux/hid_bpf.h
+++ b/include/linux/hid_bpf.h
@@ -104,6 +104,8 @@ struct hid_bpf_ops {
size_t len, enum hid_report_type rtype,
enum hid_class_request reqtype);
int (*hid_hw_output_report)(struct hid_device *hdev, __u8 *buf, size_t len);
+ int (*hid_input_report)(struct hid_device *hid, enum hid_report_type type,
+ u8 *data, u32 size, int interrupt);
struct module *owner;
const struct bus_type *bus_type;
};
--
2.43.0
This allows deferring work for later, in order to:
1. defer an event:
Sometimes we receive an out-of-proximity event, but the device cannot
be trusted enough, and we need to ensure that we won't receive another
one in the following n milliseconds. So we need to wait those n
milliseconds, and possibly re-inject that event into the stack.
2. inject new events in reaction to one given event:
We might want to transform one given event into several. This is the
case for macro keys, where a single key press is supposed to send
a sequence of key presses. But this could also be used to patch a
faulty behavior, for example when a device forgets to send a release event.
3. communicate with the device in reaction to one event:
We might want to communicate back to the device after a given event.
For example, a device might send us an event saying that it came back
from its sleep state and needs to be re-initialized.
Currently we can achieve this by keeping a userspace program around,
raising a bpf event, and letting that userspace program inject the events
and commands.
However, that program is then kept alive as a daemon just to schedule
commands; there is no logic in it that would justify an actual userspace
wakeup, so a kernel workqueue is simpler to handle.
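The minimal usage pattern is: from a device event (IRQ) program, call the
new kfunc to kick the per-device workqueue, and attach a sleepable program
to hid_bpf_offload to do the actual deferred job. A bare-bones sketch
(program names only illustrative, see the selftests for a complete example
using a queue map to pass data between the two):

SEC("fmod_ret/hid_bpf_device_event")
int BPF_PROG(kick_worker, struct hid_bpf_ctx *hctx)
{
	/* returns false if the job was already pending, true otherwise */
	hid_bpf_schedule_delayed_work(hctx, 20 /* delay in ms */);

	return 0; /* let the event through unchanged */
}

SEC("fmod_ret.s/hid_bpf_offload")
int BPF_PROG(deferred_job, struct hid_bpf_ctx *hctx)
{
	/* runs later from the per-device workqueue, in sleepable context */
	return 0;
}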
Signed-off-by: Benjamin Tissoires <[email protected]>
---
Documentation/hid/hid-bpf.rst | 24 ++-
drivers/hid/bpf/entrypoints/entrypoints.bpf.c | 8 +
drivers/hid/bpf/entrypoints/entrypoints.lskel.h | 236 +++++++++++++++---------
drivers/hid/bpf/hid_bpf_dispatch.c | 94 +++++++++-
drivers/hid/bpf/hid_bpf_jmp_table.c | 34 +++-
include/linux/hid_bpf.h | 7 +
6 files changed, 307 insertions(+), 96 deletions(-)
diff --git a/Documentation/hid/hid-bpf.rst b/Documentation/hid/hid-bpf.rst
index 0765b3298ecf..25b157fd8ed2 100644
--- a/Documentation/hid/hid-bpf.rst
+++ b/Documentation/hid/hid-bpf.rst
@@ -140,6 +140,7 @@ HID-BPF has the following attachment types available:
1. event processing/filtering with ``SEC("fmod_ret/hid_bpf_device_event")`` in libbpf
2. actions coming from userspace with ``SEC("syscall")`` in libbpf
3. change of the report descriptor with ``SEC("fmod_ret/hid_bpf_rdesc_fixup")`` in libbpf
+4. execution of a delayed action with ``SEC("fmod_ret.s/hid_bpf_offload")`` in libbpf
A ``hid_bpf_device_event`` is calling a BPF program when an event is received from
the device. Thus we are in IRQ context and can act on the data or notify userspace.
@@ -149,6 +150,12 @@ A ``syscall`` means that userspace called the syscall ``BPF_PROG_RUN`` facility.
This time, we can do any operations allowed by HID-BPF, and talking to the device is
allowed.
+A ``hid_bpf_offload`` program is called whenever a HID-BPF program attached to
+``hid_bpf_device_event`` has called the kfunc ``hid_bpf_schedule_delayed_work``.
+The jobs are globally shared per HID device, so this might be called more often
+than expected, because another HID-BPF program might also have called
+``hid_bpf_schedule_delayed_work``. So please check that you actually have a job scheduled.
+
Last, ``hid_bpf_rdesc_fixup`` is different from the others as there can be only one
BPF program of this type. This is called on ``probe`` from the driver and allows to
change the report descriptor from the BPF program. Once a ``hid_bpf_rdesc_fixup``
@@ -167,13 +174,13 @@ Available tracing functions to attach a HID-BPF program:
--------------------------------------------------------
.. kernel-doc:: drivers/hid/bpf/hid_bpf_dispatch.c
- :functions: hid_bpf_device_event hid_bpf_rdesc_fixup
+ :functions: hid_bpf_device_event hid_bpf_rdesc_fixup hid_bpf_offload
Available API that can be used in all HID-BPF programs:
-------------------------------------------------------
.. kernel-doc:: drivers/hid/bpf/hid_bpf_dispatch.c
- :functions: hid_bpf_get_data
+ :functions: hid_bpf_get_data hid_bpf_schedule_delayed_work
Available API that can be used in syscall HID-BPF programs:
-----------------------------------------------------------
@@ -181,6 +188,19 @@ Available API that can be used in syscall HID-BPF programs:
.. kernel-doc:: drivers/hid/bpf/hid_bpf_dispatch.c
:functions: hid_bpf_attach_prog hid_bpf_hw_request hid_bpf_hw_output_report hid_bpf_input_report hid_bpf_allocate_context hid_bpf_release_context
+Available API that can be used in sleepable HID-BPF programs:
+-------------------------------------------------------------
+
+When using a sleepable HID-BPF program (attached with ``SEC("fmod_ret.s/hid_bpf_offload")``),
+you can use the following functions:
+
+- ``hid_bpf_hw_request``
+- ``hid_bpf_hw_output_report``
+- ``hid_bpf_input_report``
+- ``hid_bpf_schedule_delayed_work``
+- ``hid_bpf_allocate_context``
+- ``hid_bpf_release_context``
+
General overview of a HID-BPF program
=====================================
diff --git a/drivers/hid/bpf/entrypoints/entrypoints.bpf.c b/drivers/hid/bpf/entrypoints/entrypoints.bpf.c
index c22921125a1a..fe7d5c808034 100644
--- a/drivers/hid/bpf/entrypoints/entrypoints.bpf.c
+++ b/drivers/hid/bpf/entrypoints/entrypoints.bpf.c
@@ -22,4 +22,12 @@ int BPF_PROG(hid_tail_call, struct hid_bpf_ctx *hctx)
return 0;
}
+SEC("fmod_ret.s/__hid_bpf_tail_call_sleepable")
+int BPF_PROG(hid_tail_call_sleepable, struct hid_bpf_ctx *hctx)
+{
+ bpf_tail_call(ctx, &hid_jmp_table, hctx->index);
+
+ return 0;
+}
+
char LICENSE[] SEC("license") = "GPL";
diff --git a/drivers/hid/bpf/entrypoints/entrypoints.lskel.h b/drivers/hid/bpf/entrypoints/entrypoints.lskel.h
index 35618051598c..8883c436ab8c 100644
--- a/drivers/hid/bpf/entrypoints/entrypoints.lskel.h
+++ b/drivers/hid/bpf/entrypoints/entrypoints.lskel.h
@@ -12,9 +12,11 @@ struct entrypoints_bpf {
} maps;
struct {
struct bpf_prog_desc hid_tail_call;
+ struct bpf_prog_desc hid_tail_call_sleepable;
} progs;
struct {
int hid_tail_call_fd;
+ int hid_tail_call_sleepable_fd;
} links;
};
@@ -29,12 +31,24 @@ entrypoints_bpf__hid_tail_call__attach(struct entrypoints_bpf *skel)
return fd;
}
+static inline int
+entrypoints_bpf__hid_tail_call_sleepable__attach(struct entrypoints_bpf *skel)
+{
+ int prog_fd = skel->progs.hid_tail_call_sleepable.prog_fd;
+ int fd = skel_raw_tracepoint_open(NULL, prog_fd);
+
+ if (fd > 0)
+ skel->links.hid_tail_call_sleepable_fd = fd;
+ return fd;
+}
+
static inline int
entrypoints_bpf__attach(struct entrypoints_bpf *skel)
{
int ret = 0;
ret = ret < 0 ? ret : entrypoints_bpf__hid_tail_call__attach(skel);
+ ret = ret < 0 ? ret : entrypoints_bpf__hid_tail_call_sleepable__attach(skel);
return ret < 0 ? ret : 0;
}
@@ -42,6 +56,7 @@ static inline void
entrypoints_bpf__detach(struct entrypoints_bpf *skel)
{
skel_closenz(skel->links.hid_tail_call_fd);
+ skel_closenz(skel->links.hid_tail_call_sleepable_fd);
}
static void
entrypoints_bpf__destroy(struct entrypoints_bpf *skel)
@@ -50,6 +65,7 @@ entrypoints_bpf__destroy(struct entrypoints_bpf *skel)
return;
entrypoints_bpf__detach(skel);
skel_closenz(skel->progs.hid_tail_call.prog_fd);
+ skel_closenz(skel->progs.hid_tail_call_sleepable.prog_fd);
skel_closenz(skel->maps.hid_jmp_table.map_fd);
skel_free(skel);
}
@@ -73,10 +89,7 @@ entrypoints_bpf__load(struct entrypoints_bpf *skel)
{
struct bpf_load_and_run_opts opts = {};
int err;
-
- opts.ctx = (struct bpf_loader_ctx *)skel;
- opts.data_sz = 2856;
- opts.data = (void *)"\
+ static const char opts_data[] __attribute__((__aligned__(8))) = "\
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
@@ -110,7 +123,7 @@ entrypoints_bpf__load(struct entrypoints_bpf *skel)
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x9f\xeb\x01\0\
-\x18\0\0\0\0\0\0\0\x60\x02\0\0\x60\x02\0\0\x12\x02\0\0\0\0\0\0\0\0\0\x02\x03\0\
+\x18\0\0\0\0\0\0\0\x80\x02\0\0\x80\x02\0\0\x93\x02\0\0\0\0\0\0\0\0\0\x02\x03\0\
\0\0\x01\0\0\0\0\0\0\x01\x04\0\0\0\x20\0\0\x01\0\0\0\0\0\0\0\x03\0\0\0\0\x02\0\
\0\0\x04\0\0\0\x03\0\0\0\x05\0\0\0\0\0\0\x01\x04\0\0\0\x20\0\0\0\0\0\0\0\0\0\0\
\x02\x06\0\0\0\0\0\0\0\0\0\0\x03\0\0\0\0\x02\0\0\0\x04\0\0\0\0\x04\0\0\0\0\0\0\
@@ -123,96 +136,141 @@ entrypoints_bpf__load(struct entrypoints_bpf *skel)
\0\0\0\0\0\0\0\x1b\x01\0\0\x12\0\0\0\x40\0\0\0\x1f\x01\0\0\x10\0\0\0\x80\0\0\0\
\x2e\x01\0\0\x14\0\0\0\xa0\0\0\0\0\0\0\0\x15\0\0\0\xc0\0\0\0\x3a\x01\0\0\0\0\0\
\x08\x11\0\0\0\x40\x01\0\0\0\0\0\x01\x04\0\0\0\x20\0\0\0\0\0\0\0\0\0\0\x02\x13\
-\0\0\0\0\0\0\0\0\0\0\x0a\x1c\0\0\0\x4d\x01\0\0\x04\0\0\x06\x04\0\0\0\x5d\x01\0\
+\0\0\0\0\0\0\0\0\0\0\x0a\x1e\0\0\0\x4d\x01\0\0\x04\0\0\x06\x04\0\0\0\x5d\x01\0\
\0\0\0\0\0\x6e\x01\0\0\x01\0\0\0\x80\x01\0\0\x02\0\0\0\x93\x01\0\0\x03\0\0\0\0\
\0\0\0\x02\0\0\x05\x04\0\0\0\xa4\x01\0\0\x16\0\0\0\0\0\0\0\xab\x01\0\0\x16\0\0\
-\0\0\0\0\0\xb0\x01\0\0\0\0\0\x08\x02\0\0\0\xec\x01\0\0\0\0\0\x01\x01\0\0\0\x08\
-\0\0\x01\0\0\0\0\0\0\0\x03\0\0\0\0\x17\0\0\0\x04\0\0\0\x04\0\0\0\xf1\x01\0\0\0\
-\0\0\x0e\x18\0\0\0\x01\0\0\0\xf9\x01\0\0\x01\0\0\x0f\x20\0\0\0\x0a\0\0\0\0\0\0\
-\0\x20\0\0\0\xff\x01\0\0\x01\0\0\x0f\x04\0\0\0\x19\0\0\0\0\0\0\0\x04\0\0\0\x07\
-\x02\0\0\0\0\0\x07\0\0\0\0\0\x69\x6e\x74\0\x5f\x5f\x41\x52\x52\x41\x59\x5f\x53\
-\x49\x5a\x45\x5f\x54\x59\x50\x45\x5f\x5f\0\x74\x79\x70\x65\0\x6d\x61\x78\x5f\
-\x65\x6e\x74\x72\x69\x65\x73\0\x6b\x65\x79\x5f\x73\x69\x7a\x65\0\x76\x61\x6c\
-\x75\x65\x5f\x73\x69\x7a\x65\0\x68\x69\x64\x5f\x6a\x6d\x70\x5f\x74\x61\x62\x6c\
-\x65\0\x75\x6e\x73\x69\x67\x6e\x65\x64\x20\x6c\x6f\x6e\x67\x20\x6c\x6f\x6e\x67\
-\0\x63\x74\x78\0\x68\x69\x64\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\0\x66\x6d\
-\x6f\x64\x5f\x72\x65\x74\x2f\x5f\x5f\x68\x69\x64\x5f\x62\x70\x66\x5f\x74\x61\
-\x69\x6c\x5f\x63\x61\x6c\x6c\0\x2f\x68\x6f\x6d\x65\x2f\x62\x74\x69\x73\x73\x6f\
-\x69\x72\x2f\x53\x72\x63\x2f\x68\x69\x64\x2f\x64\x72\x69\x76\x65\x72\x73\x2f\
-\x68\x69\x64\x2f\x62\x70\x66\x2f\x65\x6e\x74\x72\x79\x70\x6f\x69\x6e\x74\x73\
-\x2f\x65\x6e\x74\x72\x79\x70\x6f\x69\x6e\x74\x73\x2e\x62\x70\x66\x2e\x63\0\x69\
-\x6e\x74\x20\x42\x50\x46\x5f\x50\x52\x4f\x47\x28\x68\x69\x64\x5f\x74\x61\x69\
-\x6c\x5f\x63\x61\x6c\x6c\x2c\x20\x73\x74\x72\x75\x63\x74\x20\x68\x69\x64\x5f\
-\x62\x70\x66\x5f\x63\x74\x78\x20\x2a\x68\x63\x74\x78\x29\0\x68\x69\x64\x5f\x62\
-\x70\x66\x5f\x63\x74\x78\0\x69\x6e\x64\x65\x78\0\x68\x69\x64\0\x61\x6c\x6c\x6f\
-\x63\x61\x74\x65\x64\x5f\x73\x69\x7a\x65\0\x72\x65\x70\x6f\x72\x74\x5f\x74\x79\
-\x70\x65\0\x5f\x5f\x75\x33\x32\0\x75\x6e\x73\x69\x67\x6e\x65\x64\x20\x69\x6e\
-\x74\0\x68\x69\x64\x5f\x72\x65\x70\x6f\x72\x74\x5f\x74\x79\x70\x65\0\x48\x49\
-\x44\x5f\x49\x4e\x50\x55\x54\x5f\x52\x45\x50\x4f\x52\x54\0\x48\x49\x44\x5f\x4f\
-\x55\x54\x50\x55\x54\x5f\x52\x45\x50\x4f\x52\x54\0\x48\x49\x44\x5f\x46\x45\x41\
-\x54\x55\x52\x45\x5f\x52\x45\x50\x4f\x52\x54\0\x48\x49\x44\x5f\x52\x45\x50\x4f\
-\x52\x54\x5f\x54\x59\x50\x45\x53\0\x72\x65\x74\x76\x61\x6c\0\x73\x69\x7a\x65\0\
-\x5f\x5f\x73\x33\x32\0\x30\x3a\x30\0\x09\x62\x70\x66\x5f\x74\x61\x69\x6c\x5f\
-\x63\x61\x6c\x6c\x28\x63\x74\x78\x2c\x20\x26\x68\x69\x64\x5f\x6a\x6d\x70\x5f\
-\x74\x61\x62\x6c\x65\x2c\x20\x68\x63\x74\x78\x2d\x3e\x69\x6e\x64\x65\x78\x29\
-\x3b\0\x63\x68\x61\x72\0\x4c\x49\x43\x45\x4e\x53\x45\0\x2e\x6d\x61\x70\x73\0\
-\x6c\x69\x63\x65\x6e\x73\x65\0\x68\x69\x64\x5f\x64\x65\x76\x69\x63\x65\0\0\0\0\
-\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x8a\x04\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x03\
-\0\0\0\x04\0\0\0\x04\0\0\0\0\x04\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x68\x69\x64\x5f\
-\x6a\x6d\x70\x5f\x74\x61\x62\x6c\x65\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
-\0\0\0\0\0\0\0\0\0\0\x47\x50\x4c\0\0\0\0\0\x79\x12\0\0\0\0\0\0\x61\x23\0\0\0\0\
-\0\0\x18\x52\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x85\0\0\0\x0c\0\0\0\xb7\0\0\0\0\0\0\0\
-\x95\0\0\0\0\0\0\0\0\0\0\0\x0e\0\0\0\0\0\0\0\x8e\0\0\0\xd3\0\0\0\x05\x48\0\0\
-\x01\0\0\0\x8e\0\0\0\xba\x01\0\0\x02\x50\0\0\x05\0\0\0\x8e\0\0\0\xd3\0\0\0\x05\
-\x48\0\0\x08\0\0\0\x0f\0\0\0\xb6\x01\0\0\0\0\0\0\x1a\0\0\0\x07\0\0\0\0\0\0\0\0\
-\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x68\x69\
-\x64\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\0\0\0\0\0\0\0\x1a\0\0\0\0\0\0\0\
-\x08\0\0\0\0\0\0\0\0\0\0\0\x01\0\0\0\x10\0\0\0\0\0\0\0\0\0\0\0\x03\0\0\0\x01\0\
-\0\0\0\0\0\0\x01\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x10\0\0\0\0\0\0\0\x5f\
-\x5f\x68\x69\x64\x5f\x62\x70\x66\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\0\0\0\
+\0\0\0\0\0\xb0\x01\0\0\0\0\0\x08\x02\0\0\0\0\0\0\0\x01\0\0\x0d\x02\0\0\0\x5f\0\
+\0\0\x0b\0\0\0\xec\x01\0\0\x01\0\0\x0c\x17\0\0\0\x6d\x02\0\0\0\0\0\x01\x01\0\0\
+\0\x08\0\0\x01\0\0\0\0\0\0\0\x03\0\0\0\0\x19\0\0\0\x04\0\0\0\x04\0\0\0\x72\x02\
+\0\0\0\0\0\x0e\x1a\0\0\0\x01\0\0\0\x7a\x02\0\0\x01\0\0\x0f\x20\0\0\0\x0a\0\0\0\
+\0\0\0\0\x20\0\0\0\x80\x02\0\0\x01\0\0\x0f\x04\0\0\0\x1b\0\0\0\0\0\0\0\x04\0\0\
+\0\x88\x02\0\0\0\0\0\x07\0\0\0\0\0\x69\x6e\x74\0\x5f\x5f\x41\x52\x52\x41\x59\
+\x5f\x53\x49\x5a\x45\x5f\x54\x59\x50\x45\x5f\x5f\0\x74\x79\x70\x65\0\x6d\x61\
+\x78\x5f\x65\x6e\x74\x72\x69\x65\x73\0\x6b\x65\x79\x5f\x73\x69\x7a\x65\0\x76\
+\x61\x6c\x75\x65\x5f\x73\x69\x7a\x65\0\x68\x69\x64\x5f\x6a\x6d\x70\x5f\x74\x61\
+\x62\x6c\x65\0\x75\x6e\x73\x69\x67\x6e\x65\x64\x20\x6c\x6f\x6e\x67\x20\x6c\x6f\
+\x6e\x67\0\x63\x74\x78\0\x68\x69\x64\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\0\
+\x66\x6d\x6f\x64\x5f\x72\x65\x74\x2f\x5f\x5f\x68\x69\x64\x5f\x62\x70\x66\x5f\
+\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\0\x2f\x68\x6f\x6d\x65\x2f\x62\x74\x69\x73\
+\x73\x6f\x69\x72\x2f\x53\x72\x63\x2f\x68\x69\x64\x2f\x64\x72\x69\x76\x65\x72\
+\x73\x2f\x68\x69\x64\x2f\x62\x70\x66\x2f\x65\x6e\x74\x72\x79\x70\x6f\x69\x6e\
+\x74\x73\x2f\x65\x6e\x74\x72\x79\x70\x6f\x69\x6e\x74\x73\x2e\x62\x70\x66\x2e\
+\x63\0\x69\x6e\x74\x20\x42\x50\x46\x5f\x50\x52\x4f\x47\x28\x68\x69\x64\x5f\x74\
+\x61\x69\x6c\x5f\x63\x61\x6c\x6c\x2c\x20\x73\x74\x72\x75\x63\x74\x20\x68\x69\
+\x64\x5f\x62\x70\x66\x5f\x63\x74\x78\x20\x2a\x68\x63\x74\x78\x29\0\x68\x69\x64\
+\x5f\x62\x70\x66\x5f\x63\x74\x78\0\x69\x6e\x64\x65\x78\0\x68\x69\x64\0\x61\x6c\
+\x6c\x6f\x63\x61\x74\x65\x64\x5f\x73\x69\x7a\x65\0\x72\x65\x70\x6f\x72\x74\x5f\
+\x74\x79\x70\x65\0\x5f\x5f\x75\x33\x32\0\x75\x6e\x73\x69\x67\x6e\x65\x64\x20\
+\x69\x6e\x74\0\x68\x69\x64\x5f\x72\x65\x70\x6f\x72\x74\x5f\x74\x79\x70\x65\0\
+\x48\x49\x44\x5f\x49\x4e\x50\x55\x54\x5f\x52\x45\x50\x4f\x52\x54\0\x48\x49\x44\
+\x5f\x4f\x55\x54\x50\x55\x54\x5f\x52\x45\x50\x4f\x52\x54\0\x48\x49\x44\x5f\x46\
+\x45\x41\x54\x55\x52\x45\x5f\x52\x45\x50\x4f\x52\x54\0\x48\x49\x44\x5f\x52\x45\
+\x50\x4f\x52\x54\x5f\x54\x59\x50\x45\x53\0\x72\x65\x74\x76\x61\x6c\0\x73\x69\
+\x7a\x65\0\x5f\x5f\x73\x33\x32\0\x30\x3a\x30\0\x09\x62\x70\x66\x5f\x74\x61\x69\
+\x6c\x5f\x63\x61\x6c\x6c\x28\x63\x74\x78\x2c\x20\x26\x68\x69\x64\x5f\x6a\x6d\
+\x70\x5f\x74\x61\x62\x6c\x65\x2c\x20\x68\x63\x74\x78\x2d\x3e\x69\x6e\x64\x65\
+\x78\x29\x3b\0\x68\x69\x64\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\x5f\x73\x6c\
+\x65\x65\x70\x61\x62\x6c\x65\0\x66\x6d\x6f\x64\x5f\x72\x65\x74\x2e\x73\x2f\x5f\
+\x5f\x68\x69\x64\x5f\x62\x70\x66\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\x5f\
+\x73\x6c\x65\x65\x70\x61\x62\x6c\x65\0\x69\x6e\x74\x20\x42\x50\x46\x5f\x50\x52\
+\x4f\x47\x28\x68\x69\x64\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\x5f\x73\x6c\
+\x65\x65\x70\x61\x62\x6c\x65\x2c\x20\x73\x74\x72\x75\x63\x74\x20\x68\x69\x64\
+\x5f\x62\x70\x66\x5f\x63\x74\x78\x20\x2a\x68\x63\x74\x78\x29\0\x63\x68\x61\x72\
+\0\x4c\x49\x43\x45\x4e\x53\x45\0\x2e\x6d\x61\x70\x73\0\x6c\x69\x63\x65\x6e\x73\
+\x65\0\x68\x69\x64\x5f\x64\x65\x76\x69\x63\x65\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
+\0\0\0\0\0\0\x2b\x05\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x03\0\0\0\x04\0\0\0\x04\0\0\0\
+\0\x04\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x68\x69\x64\x5f\x6a\x6d\x70\x5f\x74\x61\x62\
+\x6c\x65\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x47\x50\
+\x4c\0\0\0\0\0\x79\x12\0\0\0\0\0\0\x61\x23\0\0\0\0\0\0\x18\x52\0\0\0\0\0\0\0\0\
+\0\0\0\0\0\0\x85\0\0\0\x0c\0\0\0\xb7\0\0\0\0\0\0\0\x95\0\0\0\0\0\0\0\0\0\0\0\
+\x0e\0\0\0\0\0\0\0\x8e\0\0\0\xd3\0\0\0\x05\x48\0\0\x01\0\0\0\x8e\0\0\0\xba\x01\
+\0\0\x02\x50\0\0\x05\0\0\0\x8e\0\0\0\xd3\0\0\0\x05\x48\0\0\x08\0\0\0\x0f\0\0\0\
+\xb6\x01\0\0\0\0\0\0\x1a\0\0\0\x07\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
+\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x68\x69\x64\x5f\x74\x61\x69\x6c\x5f\
+\x63\x61\x6c\x6c\0\0\0\0\0\0\0\x1a\0\0\0\0\0\0\0\x08\0\0\0\0\0\0\0\0\0\0\0\x01\
+\0\0\0\x10\0\0\0\0\0\0\0\0\0\0\0\x03\0\0\0\x01\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\0\
+\0\0\0\0\0\0\0\0\0\0\0\0\x10\0\0\0\0\0\0\0\x5f\x5f\x68\x69\x64\x5f\x62\x70\x66\
+\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\0\0\0\0\0\x47\x50\x4c\0\0\0\0\0\x79\
+\x12\0\0\0\0\0\0\x61\x23\0\0\0\0\0\0\x18\x52\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x85\0\
+\0\0\x0c\0\0\0\xb7\0\0\0\0\0\0\0\x95\0\0\0\0\0\0\0\0\0\0\0\x18\0\0\0\0\0\0\0\
+\x8e\0\0\0\x2d\x02\0\0\x05\x68\0\0\x01\0\0\0\x8e\0\0\0\xba\x01\0\0\x02\x70\0\0\
+\x05\0\0\0\x8e\0\0\0\x2d\x02\0\0\x05\x68\0\0\x08\0\0\0\x0f\0\0\0\xb6\x01\0\0\0\
+\0\0\0\x1a\0\0\0\x07\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
+\0\0\0\0\0\0\0\0\0\0\x10\0\0\0\x68\x69\x64\x5f\x74\x61\x69\x6c\x5f\x63\x61\x6c\
+\x6c\x5f\x73\0\0\0\0\0\x1a\0\0\0\0\0\0\0\x08\0\0\0\0\0\0\0\0\0\0\0\x01\0\0\0\
+\x10\0\0\0\0\0\0\0\0\0\0\0\x03\0\0\0\x01\0\0\0\0\0\0\0\x01\0\0\0\0\0\0\0\0\0\0\
+\0\0\0\0\0\0\0\0\0\x10\0\0\0\0\0\0\0\x5f\x5f\x68\x69\x64\x5f\x62\x70\x66\x5f\
+\x74\x61\x69\x6c\x5f\x63\x61\x6c\x6c\x5f\x73\x6c\x65\x65\x70\x61\x62\x6c\x65\0\
\0\0";
- opts.insns_sz = 1192;
- opts.insns = (void *)"\
+ static const char opts_insn[] __attribute__((__aligned__(8))) = "\
\xbf\x16\0\0\0\0\0\0\xbf\xa1\0\0\0\0\0\0\x07\x01\0\0\x78\xff\xff\xff\xb7\x02\0\
-\0\x88\0\0\0\xb7\x03\0\0\0\0\0\0\x85\0\0\0\x71\0\0\0\x05\0\x11\0\0\0\0\0\x61\
+\0\x88\0\0\0\xb7\x03\0\0\0\0\0\0\x85\0\0\0\x71\0\0\0\x05\0\x14\0\0\0\0\0\x61\
\xa1\x78\xff\0\0\0\0\xd5\x01\x01\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\x61\xa1\x7c\xff\
\0\0\0\0\xd5\x01\x01\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\x61\xa1\x80\xff\0\0\0\0\xd5\
-\x01\x01\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x61\
-\x01\0\0\0\0\0\0\xd5\x01\x02\0\0\0\0\0\xbf\x19\0\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\
-\xbf\x70\0\0\0\0\0\0\x95\0\0\0\0\0\0\0\x61\x60\x08\0\0\0\0\0\x18\x61\0\0\0\0\0\
-\0\0\0\0\0\xa8\x09\0\0\x63\x01\0\0\0\0\0\0\x61\x60\x0c\0\0\0\0\0\x18\x61\0\0\0\
-\0\0\0\0\0\0\0\xa4\x09\0\0\x63\x01\0\0\0\0\0\0\x79\x60\x10\0\0\0\0\0\x18\x61\0\
-\0\0\0\0\0\0\0\0\0\x98\x09\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\
-\0\x05\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x90\x09\0\0\x7b\x01\0\0\0\0\0\0\xb7\x01\
-\0\0\x12\0\0\0\x18\x62\0\0\0\0\0\0\0\0\0\0\x90\x09\0\0\xb7\x03\0\0\x1c\0\0\0\
-\x85\0\0\0\xa6\0\0\0\xbf\x07\0\0\0\0\0\0\xc5\x07\xd7\xff\0\0\0\0\x63\x7a\x78\
-\xff\0\0\0\0\x61\x60\x1c\0\0\0\0\0\x15\0\x03\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\
-\0\0\xbc\x09\0\0\x63\x01\0\0\0\0\0\0\xb7\x01\0\0\0\0\0\0\x18\x62\0\0\0\0\0\0\0\
-\0\0\0\xb0\x09\0\0\xb7\x03\0\0\x48\0\0\0\x85\0\0\0\xa6\0\0\0\xbf\x07\0\0\0\0\0\
-\0\xc5\x07\xca\xff\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x63\x71\0\0\0\0\
-\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\xf8\x09\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x90\
-\x0a\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\0\x0a\0\0\x18\x61\0\0\
-\0\0\0\0\0\0\0\0\x88\x0a\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\
-\x38\x0a\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\xd0\x0a\0\0\x7b\x01\0\0\0\0\0\0\x18\
-\x60\0\0\0\0\0\0\0\0\0\0\x40\x0a\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\xe0\x0a\0\0\
-\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x70\x0a\0\0\x18\x61\0\0\0\0\0\
-\0\0\0\0\0\0\x0b\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\0\0\0\0\
-\x18\x61\0\0\0\0\0\0\0\0\0\0\xf8\x0a\0\0\x7b\x01\0\0\0\0\0\0\x61\x60\x08\0\0\0\
-\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x98\x0a\0\0\x63\x01\0\0\0\0\0\0\x61\x60\x0c\0\
-\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x9c\x0a\0\0\x63\x01\0\0\0\0\0\0\x79\x60\
-\x10\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\xa0\x0a\0\0\x7b\x01\0\0\0\0\0\0\x61\
-\xa0\x78\xff\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\xc8\x0a\0\0\x63\x01\0\0\0\0\0\
-\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x10\x0b\0\0\xb7\x02\0\0\x14\0\0\0\xb7\x03\0\0\
-\x0c\0\0\0\xb7\x04\0\0\0\0\0\0\x85\0\0\0\xa7\0\0\0\xbf\x07\0\0\0\0\0\0\xc5\x07\
-\x91\xff\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x80\x0a\0\0\x63\x70\x6c\0\0\0\0\0\
-\x77\x07\0\0\x20\0\0\0\x63\x70\x70\0\0\0\0\0\xb7\x01\0\0\x05\0\0\0\x18\x62\0\0\
-\0\0\0\0\0\0\0\0\x80\x0a\0\0\xb7\x03\0\0\x8c\0\0\0\x85\0\0\0\xa6\0\0\0\xbf\x07\
-\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\xf0\x0a\0\0\x61\x01\0\0\0\0\0\0\xd5\
-\x01\x02\0\0\0\0\0\xbf\x19\0\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\xc5\x07\x7f\xff\0\0\
-\0\0\x63\x7a\x80\xff\0\0\0\0\x61\xa1\x78\xff\0\0\0\0\xd5\x01\x02\0\0\0\0\0\xbf\
-\x19\0\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\x61\xa0\x80\xff\0\0\0\0\x63\x06\x28\0\0\0\
-\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x61\x10\0\0\0\0\0\0\x63\x06\x18\0\0\0\
-\0\0\xb7\0\0\0\0\0\0\0\x95\0\0\0\0\0\0\0";
+\x01\x01\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\x61\xa1\x84\xff\0\0\0\0\xd5\x01\x01\0\0\
+\0\0\0\x85\0\0\0\xa8\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x61\x01\0\0\0\0\
+\0\0\xd5\x01\x02\0\0\0\0\0\xbf\x19\0\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\xbf\x70\0\0\
+\0\0\0\0\x95\0\0\0\0\0\0\0\x61\x60\x08\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\
+\x48\x0a\0\0\x63\x01\0\0\0\0\0\0\x61\x60\x0c\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\
+\0\0\x44\x0a\0\0\x63\x01\0\0\0\0\0\0\x79\x60\x10\0\0\0\0\0\x18\x61\0\0\0\0\0\0\
+\0\0\0\0\x38\x0a\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\0\x05\0\0\
+\x18\x61\0\0\0\0\0\0\0\0\0\0\x30\x0a\0\0\x7b\x01\0\0\0\0\0\0\xb7\x01\0\0\x12\0\
+\0\0\x18\x62\0\0\0\0\0\0\0\0\0\0\x30\x0a\0\0\xb7\x03\0\0\x1c\0\0\0\x85\0\0\0\
+\xa6\0\0\0\xbf\x07\0\0\0\0\0\0\xc5\x07\xd4\xff\0\0\0\0\x63\x7a\x78\xff\0\0\0\0\
+\x61\x60\x1c\0\0\0\0\0\x15\0\x03\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x5c\x0a\
+\0\0\x63\x01\0\0\0\0\0\0\xb7\x01\0\0\0\0\0\0\x18\x62\0\0\0\0\0\0\0\0\0\0\x50\
+\x0a\0\0\xb7\x03\0\0\x48\0\0\0\x85\0\0\0\xa6\0\0\0\xbf\x07\0\0\0\0\0\0\xc5\x07\
+\xc7\xff\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x63\x71\0\0\0\0\0\0\x18\
+\x60\0\0\0\0\0\0\0\0\0\0\x98\x0a\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x30\x0b\0\0\
+\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\xa0\x0a\0\0\x18\x61\0\0\0\0\0\
+\0\0\0\0\0\x28\x0b\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\xd8\x0a\
+\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x70\x0b\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\
+\0\0\0\0\0\0\0\xe0\x0a\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x80\x0b\0\0\x7b\x01\0\0\
+\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x10\x0b\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\
+\xa0\x0b\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\0\0\0\0\x18\x61\0\
+\0\0\0\0\0\0\0\0\0\x98\x0b\0\0\x7b\x01\0\0\0\0\0\0\x61\x60\x08\0\0\0\0\0\x18\
+\x61\0\0\0\0\0\0\0\0\0\0\x38\x0b\0\0\x63\x01\0\0\0\0\0\0\x61\x60\x0c\0\0\0\0\0\
+\x18\x61\0\0\0\0\0\0\0\0\0\0\x3c\x0b\0\0\x63\x01\0\0\0\0\0\0\x79\x60\x10\0\0\0\
+\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x40\x0b\0\0\x7b\x01\0\0\0\0\0\0\x61\xa0\x78\
+\xff\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x68\x0b\0\0\x63\x01\0\0\0\0\0\0\x18\
+\x61\0\0\0\0\0\0\0\0\0\0\xb0\x0b\0\0\xb7\x02\0\0\x14\0\0\0\xb7\x03\0\0\x0c\0\0\
+\0\xb7\x04\0\0\0\0\0\0\x85\0\0\0\xa7\0\0\0\xbf\x07\0\0\0\0\0\0\xc5\x07\x8e\xff\
+\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x20\x0b\0\0\x63\x70\x6c\0\0\0\0\0\x77\x07\
+\0\0\x20\0\0\0\x63\x70\x70\0\0\0\0\0\xb7\x01\0\0\x05\0\0\0\x18\x62\0\0\0\0\0\0\
+\0\0\0\0\x20\x0b\0\0\xb7\x03\0\0\x8c\0\0\0\x85\0\0\0\xa6\0\0\0\xbf\x07\0\0\0\0\
+\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x90\x0b\0\0\x61\x01\0\0\0\0\0\0\xd5\x01\x02\0\
+\0\0\0\0\xbf\x19\0\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\xc5\x07\x7c\xff\0\0\0\0\x63\
+\x7a\x80\xff\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\xc8\x0b\0\0\x18\x61\0\0\0\0\0\
+\0\0\0\0\0\x60\x0c\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\xd0\x0b\
+\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x58\x0c\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\
+\0\0\0\0\0\0\0\x08\x0c\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\xa0\x0c\0\0\x7b\x01\0\0\
+\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x10\x0c\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\
+\xb0\x0c\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x40\x0c\0\0\x18\
+\x61\0\0\0\0\0\0\0\0\0\0\xd0\x0c\0\0\x7b\x01\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\
+\0\0\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\xc8\x0c\0\0\x7b\x01\0\0\0\0\0\0\x61\
+\x60\x08\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x68\x0c\0\0\x63\x01\0\0\0\0\0\0\
+\x61\x60\x0c\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x6c\x0c\0\0\x63\x01\0\0\0\0\
+\0\0\x79\x60\x10\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x70\x0c\0\0\x7b\x01\0\0\
+\0\0\0\0\x61\xa0\x78\xff\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\x98\x0c\0\0\x63\
+\x01\0\0\0\0\0\0\x18\x61\0\0\0\0\0\0\0\0\0\0\xe0\x0c\0\0\xb7\x02\0\0\x1e\0\0\0\
+\xb7\x03\0\0\x0c\0\0\0\xb7\x04\0\0\0\0\0\0\x85\0\0\0\xa7\0\0\0\xbf\x07\0\0\0\0\
+\0\0\xc5\x07\x45\xff\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\x50\x0c\0\0\x63\x70\
+\x6c\0\0\0\0\0\x77\x07\0\0\x20\0\0\0\x63\x70\x70\0\0\0\0\0\xb7\x01\0\0\x05\0\0\
+\0\x18\x62\0\0\0\0\0\0\0\0\0\0\x50\x0c\0\0\xb7\x03\0\0\x8c\0\0\0\x85\0\0\0\xa6\
+\0\0\0\xbf\x07\0\0\0\0\0\0\x18\x60\0\0\0\0\0\0\0\0\0\0\xc0\x0c\0\0\x61\x01\0\0\
+\0\0\0\0\xd5\x01\x02\0\0\0\0\0\xbf\x19\0\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\xc5\x07\
+\x33\xff\0\0\0\0\x63\x7a\x84\xff\0\0\0\0\x61\xa1\x78\xff\0\0\0\0\xd5\x01\x02\0\
+\0\0\0\0\xbf\x19\0\0\0\0\0\0\x85\0\0\0\xa8\0\0\0\x61\xa0\x80\xff\0\0\0\0\x63\
+\x06\x28\0\0\0\0\0\x61\xa0\x84\xff\0\0\0\0\x63\x06\x2c\0\0\0\0\0\x18\x61\0\0\0\
+\0\0\0\0\0\0\0\0\0\0\0\x61\x10\0\0\0\0\0\0\x63\x06\x18\0\0\0\0\0\xb7\0\0\0\0\0\
+\0\0\x95\0\0\0\0\0\0\0";
+
+ opts.ctx = (struct bpf_loader_ctx *)skel;
+ opts.data_sz = sizeof(opts_data) - 1;
+ opts.data = (void *)opts_data;
+ opts.insns_sz = sizeof(opts_insn) - 1;
+ opts.insns = (void *)opts_insn;
+
err = bpf_load_and_run(&opts);
if (err < 0)
return err;
diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
index e1a650f4a626..afb409a76a63 100644
--- a/drivers/hid/bpf/hid_bpf_dispatch.c
+++ b/drivers/hid/bpf/hid_bpf_dispatch.c
@@ -105,6 +105,25 @@ __weak noinline int hid_bpf_rdesc_fixup(struct hid_bpf_ctx *ctx)
return 0;
}
+/**
+ * hid_bpf_offload - Called when the workqueue attached to the HID device
+ * is triggered
+ *
+ * @ctx: The HID-BPF context
+ *
+ * @return 0 on success and keep processing; a negative error code to interrupt
+ * the processing of this job.
+ *
+ * Declare an %fmod_ret tracing bpf program to this function and attach this
+ * program through hid_bpf_attach_prog() to have this helper called when
+ * the work scheduled through hid_bpf_schedule_delayed_work() runs.
+ */
+/* never used by the kernel but declared so we can load and attach a tracepoint */
+__weak noinline int hid_bpf_offload(struct hid_bpf_ctx *ctx)
+{
+ return 0;
+}
+
u8 *call_hid_bpf_rdesc_fixup(struct hid_device *hdev, u8 *rdesc, unsigned int *size)
{
int ret;
@@ -203,6 +222,35 @@ int hid_bpf_reconnect(struct hid_device *hdev)
return 0;
}
+static void delayed_work_cb(struct work_struct *work)
+{
+ struct hid_device *hdev = container_of(work, struct hid_device,
+ bpf.work.work);
+ struct hid_bpf_ctx_kern ctx_kern = {
+ .ctx = {
+ .hid = hdev,
+ },
+ };
+ int ret;
+
+ /* no program has been attached yet, or the device is being destroyed */
+ if (!hdev->bpf.work_initialized || hdev->bpf.destroyed)
+ return;
+
+ ret = hid_bpf_prog_run(hdev, HID_BPF_PROG_TYPE_OFFLOAD, &ctx_kern);
+ if (ret < 0)
+ hid_warn_once(hdev, "error while executing HID-BPF delayed work: %d\n", ret);
+}
+
+static void hid_bpf_prep_worker(struct hid_device *hdev)
+{
+ if (hdev->bpf.work_initialized)
+ return;
+
+ INIT_DELAYED_WORK(&hdev->bpf.work, delayed_work_cb);
+ hdev->bpf.work_initialized = true;
+}
+
static int do_hid_bpf_attach_prog(struct hid_device *hdev, int prog_fd, struct bpf_prog *prog,
__u32 flags)
{
@@ -215,10 +263,15 @@ static int do_hid_bpf_attach_prog(struct hid_device *hdev, int prog_fd, struct b
if (prog_type >= HID_BPF_PROG_TYPE_MAX)
return -EINVAL;
- if (prog_type == HID_BPF_PROG_TYPE_DEVICE_EVENT) {
+ switch (prog_type) {
+ case HID_BPF_PROG_TYPE_DEVICE_EVENT:
err = hid_bpf_allocate_event_data(hdev);
if (err)
return err;
+ break;
+ case HID_BPF_PROG_TYPE_OFFLOAD:
+ hid_bpf_prep_worker(hdev);
+ break;
}
fd = __hid_bpf_attach_prog(hdev, prog_type, prog_fd, prog, flags);
@@ -536,6 +589,34 @@ hid_bpf_input_report(struct hid_bpf_ctx *ctx, enum hid_report_type type, u8 *buf
return hid_input_report(hdev, type, buf, size, 0);
}
+
+/**
+ * hid_bpf_schedule_delayed_work - Put a work task in the workqueue associated
+ * with the HID device
+ *
+ * @ctx: The HID-BPF context
+ * @delay_ms: a delay in milliseconds, or 0 for immediate execution
+ *
+ * Return: %false if the work was already queued, %true otherwise. If
+ * @delay_ms is zero and the work is idle, it will be scheduled for
+ * immediate execution.
+ */
+__bpf_kfunc bool
+hid_bpf_schedule_delayed_work(struct hid_bpf_ctx *ctx, unsigned int delay_ms)
+{
+ struct hid_device *hdev;
+
+ if (!ctx)
+ return false;
+
+ hdev = (struct hid_device *)ctx->hid; /* discard const */
+
+ if (!hdev->bpf.work_initialized)
+ return false;
+
+ return schedule_delayed_work(&hdev->bpf.work, msecs_to_jiffies(delay_ms));
+}
__bpf_kfunc_end_defs();
/*
@@ -544,6 +625,12 @@ __bpf_kfunc_end_defs();
*/
BTF_KFUNCS_START(hid_bpf_kfunc_ids)
BTF_ID_FLAGS(func, hid_bpf_get_data, KF_RET_NULL)
+BTF_ID_FLAGS(func, hid_bpf_schedule_delayed_work)
+BTF_ID_FLAGS(func, hid_bpf_hw_request, KF_SLEEPABLE)
+BTF_ID_FLAGS(func, hid_bpf_hw_output_report, KF_SLEEPABLE)
+BTF_ID_FLAGS(func, hid_bpf_input_report, KF_SLEEPABLE)
+BTF_ID_FLAGS(func, hid_bpf_allocate_context, KF_ACQUIRE | KF_RET_NULL | KF_SLEEPABLE)
+BTF_ID_FLAGS(func, hid_bpf_release_context, KF_RELEASE | KF_SLEEPABLE)
BTF_KFUNCS_END(hid_bpf_kfunc_ids)
static const struct btf_kfunc_id_set hid_bpf_kfunc_set = {
@@ -555,7 +642,9 @@ static const struct btf_kfunc_id_set hid_bpf_kfunc_set = {
BTF_SET8_START(hid_bpf_fmodret_ids)
BTF_ID_FLAGS(func, hid_bpf_device_event)
BTF_ID_FLAGS(func, hid_bpf_rdesc_fixup)
+BTF_ID_FLAGS(func, hid_bpf_offload, KF_SLEEPABLE)
BTF_ID_FLAGS(func, __hid_bpf_tail_call)
+BTF_ID_FLAGS(func, __hid_bpf_tail_call_sleepable, KF_SLEEPABLE)
BTF_SET8_END(hid_bpf_fmodret_ids)
static const struct btf_kfunc_id_set hid_bpf_fmodret_set = {
@@ -610,6 +699,9 @@ void hid_bpf_destroy_device(struct hid_device *hdev)
/* mark the device as destroyed in bpf so we don't reattach it */
hdev->bpf.destroyed = true;
+ if (hdev->bpf.work_initialized)
+ cancel_delayed_work_sync(&hdev->bpf.work);
+
__hid_bpf_destroy_device(hdev);
}
EXPORT_SYMBOL_GPL(hid_bpf_destroy_device);
diff --git a/drivers/hid/bpf/hid_bpf_jmp_table.c b/drivers/hid/bpf/hid_bpf_jmp_table.c
index aa8e1c79cdf5..d401add0e2ab 100644
--- a/drivers/hid/bpf/hid_bpf_jmp_table.c
+++ b/drivers/hid/bpf/hid_bpf_jmp_table.c
@@ -59,6 +59,7 @@ static DECLARE_WORK(release_work, hid_bpf_release_progs);
BTF_ID_LIST(hid_bpf_btf_ids)
BTF_ID(func, hid_bpf_device_event) /* HID_BPF_PROG_TYPE_DEVICE_EVENT */
BTF_ID(func, hid_bpf_rdesc_fixup) /* HID_BPF_PROG_TYPE_RDESC_FIXUP */
+BTF_ID(func, hid_bpf_offload) /* HID_BPF_PROG_TYPE_OFFLOAD */
static int hid_bpf_max_programs(enum hid_bpf_prog_type type)
{
@@ -67,6 +68,8 @@ static int hid_bpf_max_programs(enum hid_bpf_prog_type type)
return HID_BPF_MAX_PROGS_PER_DEV;
case HID_BPF_PROG_TYPE_RDESC_FIXUP:
return 1;
+ case HID_BPF_PROG_TYPE_OFFLOAD:
+ return HID_BPF_MAX_PROGS_PER_DEV;
default:
return -EINVAL;
}
@@ -104,14 +107,30 @@ __weak noinline int __hid_bpf_tail_call(struct hid_bpf_ctx *ctx)
return 0;
}
+__weak noinline int __hid_bpf_tail_call_sleepable(struct hid_bpf_ctx *ctx)
+{
+ return 0;
+}
+
int hid_bpf_prog_run(struct hid_device *hdev, enum hid_bpf_prog_type type,
struct hid_bpf_ctx_kern *ctx_kern)
{
struct hid_bpf_prog_list *prog_list;
int i, idx, err = 0;
- rcu_read_lock();
- prog_list = rcu_dereference(hdev->bpf.progs[type]);
+ if (type == HID_BPF_PROG_TYPE_OFFLOAD) {
+ /*
+ * HID_BPF_PROG_TYPE_OFFLOAD programs can sleep, so we can not
+ * take an rcu_read_lock() here.
+ * Prevent concurrency by taking &hid_bpf_attach_lock
+ * instead.
+ */
+ mutex_lock(&hid_bpf_attach_lock);
+ prog_list = hdev->bpf.progs[type];
+ } else {
+ rcu_read_lock();
+ prog_list = rcu_dereference(hdev->bpf.progs[type]);
+ }
if (!prog_list)
goto out_unlock;
@@ -123,7 +142,10 @@ int hid_bpf_prog_run(struct hid_device *hdev, enum hid_bpf_prog_type type,
continue;
ctx_kern->ctx.index = idx;
- err = __hid_bpf_tail_call(&ctx_kern->ctx);
+ if (type == HID_BPF_PROG_TYPE_OFFLOAD)
+ err = __hid_bpf_tail_call_sleepable(&ctx_kern->ctx);
+ else
+ err = __hid_bpf_tail_call(&ctx_kern->ctx);
if (err < 0)
break;
if (err)
@@ -131,7 +153,10 @@ int hid_bpf_prog_run(struct hid_device *hdev, enum hid_bpf_prog_type type,
}
out_unlock:
- rcu_read_unlock();
+ if (type == HID_BPF_PROG_TYPE_OFFLOAD)
+ mutex_unlock(&hid_bpf_attach_lock);
+ else
+ rcu_read_unlock();
return err;
}
@@ -557,6 +582,7 @@ int hid_bpf_preload_skel(void)
}
ATTACH_AND_STORE_LINK(hid_tail_call);
+ ATTACH_AND_STORE_LINK(hid_tail_call_sleepable);
return 0;
out:
diff --git a/include/linux/hid_bpf.h b/include/linux/hid_bpf.h
index 17b08f500098..87e72ffadff0 100644
--- a/include/linux/hid_bpf.h
+++ b/include/linux/hid_bpf.h
@@ -5,6 +5,7 @@
#include <linux/bpf.h>
#include <linux/spinlock.h>
+#include <linux/workqueue.h>
#include <uapi/linux/hid.h>
struct hid_device;
@@ -76,6 +77,7 @@ enum hid_bpf_attach_flags {
/* Following functions are tracepoints that BPF programs can attach to */
int hid_bpf_device_event(struct hid_bpf_ctx *ctx);
int hid_bpf_rdesc_fixup(struct hid_bpf_ctx *ctx);
+int hid_bpf_offload(struct hid_bpf_ctx *ctx);
/*
* Below is HID internal
@@ -83,6 +85,7 @@ int hid_bpf_rdesc_fixup(struct hid_bpf_ctx *ctx);
/* internal function to call eBPF programs, not to be used by anybody */
int __hid_bpf_tail_call(struct hid_bpf_ctx *ctx);
+int __hid_bpf_tail_call_sleepable(struct hid_bpf_ctx *ctx);
#define HID_BPF_MAX_PROGS_PER_DEV 64
#define HID_BPF_FLAG_MASK (((HID_BPF_FLAG_MAX - 1) << 1) - 1)
@@ -92,6 +95,7 @@ enum hid_bpf_prog_type {
HID_BPF_PROG_TYPE_UNDEF = -1,
HID_BPF_PROG_TYPE_DEVICE_EVENT, /* an event is emitted from the device */
HID_BPF_PROG_TYPE_RDESC_FIXUP,
+ HID_BPF_PROG_TYPE_OFFLOAD,
HID_BPF_PROG_TYPE_MAX,
};
@@ -129,6 +133,9 @@ struct hid_bpf {
bool destroyed; /* prevents the assignment of any progs */
spinlock_t progs_lock; /* protects RCU update of progs */
+
+ struct delayed_work work;
+ bool work_initialized;
};
/* specific HID-BPF link when a program is attached to a device */
--
2.43.0
These ones are a little bit more complex, but allow checking whether
sleepable tracing functions can be called:
- one event is injected and then pushed into the queue map
- optionally another event gets injected in the queue map
- the events in the queue are then re-injected into the HID stack
- if there is an error while re-injecting an event, we try again 5 ms later
- ensure we can add another event to the queue or call other sleepable
  kfuncs
- ensure we receive exactly the expected events, and nothing else
Signed-off-by: Benjamin Tissoires <[email protected]>
---
tools/testing/selftests/hid/hid_bpf.c | 117 +++++++++++++++++++++++++++++++-
tools/testing/selftests/hid/progs/hid.c | 104 ++++++++++++++++++++++++++++
2 files changed, 220 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/hid/hid_bpf.c b/tools/testing/selftests/hid/hid_bpf.c
index bb95ff90951b..cfa42d603acb 100644
--- a/tools/testing/selftests/hid/hid_bpf.c
+++ b/tools/testing/selftests/hid/hid_bpf.c
@@ -460,7 +460,7 @@ FIXTURE(hid_bpf) {
int hid_id;
pthread_t tid;
struct hid *skel;
- int hid_links[3]; /* max number of programs loaded in a single test */
+ int hid_links[4]; /* max number of programs loaded in a single test */
};
static void detach_bpf(FIXTURE_DATA(hid_bpf) * self)
{
@@ -932,6 +932,121 @@ TEST_F(hid_bpf, test_hid_schedule_work)
ASSERT_EQ(workload_data, 53);
}
+/*
+ * Call hid_bpf_schedule_delayed_work against the given uhid device,
+ * ensure we can inject events (call a sleepable tracing function),
+ * check that the program is called and does what is expected.
+ */
+TEST_F(hid_bpf, test_hid_schedule_work_defer_events)
+{
+ const struct test_program progs[] = {
+ { .name = "hid_defer_event" },
+ { .name = "hid_offload_inject", .insert_head = 1 },
+ { .name = "hid_offload_multiply_events" },
+ { .name = "hid_offload_notify" },
+ };
+ struct ring_buffer *rb = NULL;
+ __u8 buf[10] = {0};
+ __u32 *delay;
+ int err;
+
+ LOAD_PROGRAMS(progs);
+
+ /* Set up ring buffer polling */
+ rb = ring_buffer__new(bpf_map__fd(self->skel->maps.rb), handle_event, NULL, NULL);
+ ASSERT_OK_PTR(rb) TH_LOG("Failed to create ring buffer");
+ ASSERT_EQ(workload_data, 0);
+
+ delay = (__u32 *)&buf[2];
+
+ /* inject one event */
+ buf[0] = 1;
+ buf[1] = 42; /* this will be placed in the ring buffer */
+ *delay = 0;
+ uhid_send_event(_metadata, self->uhid_fd, buf, 6);
+
+ err = ring_buffer__poll(rb, 100 /* timeout, ms */);
+ ASSERT_EQ(err, 1) TH_LOG("error while calling ring_buffer__poll");
+
+ ASSERT_EQ(workload_data, 42);
+
+ err = ring_buffer__poll(rb, 1000 /* timeout, ms */);
+ ASSERT_EQ(err, 1) TH_LOG("error while calling ring_buffer__poll");
+ ASSERT_EQ(workload_data, 52);
+
+ /* read twice the data from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, 6) TH_LOG("read_hidraw");
+ ASSERT_EQ(buf[0], 2);
+ ASSERT_EQ(buf[1], 42);
+
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, 6) TH_LOG("read_hidraw");
+ ASSERT_EQ(buf[0], 2);
+ ASSERT_EQ(buf[1], 52);
+
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, -1) TH_LOG("read_hidraw: too many events");
+}
+
+TEST_F(hid_bpf, test_hid_schedule_work_defer_events_2)
+{
+ const struct test_program progs[] = {
+ { .name = "hid_defer_multiple_events" },
+ { .name = "hid_offload_inject", .insert_head = 1 },
+ { .name = "hid_offload_hw_request" },
+ { .name = "hid_offload_notify" },
+ };
+ struct ring_buffer *rb = NULL;
+ __u8 buf[10] = {0};
+ int err;
+
+ LOAD_PROGRAMS(progs);
+
+ /* Set up ring buffer polling */
+ rb = ring_buffer__new(bpf_map__fd(self->skel->maps.rb), handle_event, NULL, NULL);
+ ASSERT_OK_PTR(rb) TH_LOG("Failed to create ring buffer");
+ ASSERT_EQ(workload_data, 0);
+
+ /* inject one event */
+ buf[0] = 1;
+ buf[1] = 47;
+ buf[2] = 50;
+ uhid_send_event(_metadata, self->uhid_fd, buf, 6);
+
+ err = ring_buffer__poll(rb, 100 /* timeout, ms */);
+ ASSERT_EQ(err, 1) TH_LOG("error while calling ring_buffer__poll");
+ ASSERT_EQ(workload_data, 3);
+
+ /* read the data from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, 6) TH_LOG("read_hidraw");
+ ASSERT_EQ(buf[0], 2);
+ ASSERT_EQ(buf[1], 3);
+ ASSERT_EQ(buf[2], 4) TH_LOG("leftovers_from_previous_test");
+
+ err = ring_buffer__poll(rb, 100 /* timeout, ms */);
+ ASSERT_EQ(err, 1) TH_LOG("error while calling ring_buffer__poll");
+ ASSERT_EQ(workload_data, 4);
+
+ /* read the data from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, 6) TH_LOG("read_hidraw");
+ ASSERT_EQ(buf[0], 2);
+ ASSERT_EQ(buf[1], 4);
+ ASSERT_EQ(buf[2], 6);
+
+ /* read the data from hidraw */
+ memset(buf, 0, sizeof(buf));
+ err = read(self->hidraw_fd, buf, sizeof(buf));
+ ASSERT_EQ(err, -1) TH_LOG("read_hidraw");
+}
+
/*
* Attach hid_insert{0,1,2} to the given uhid device,
* retrieve and open the matching hidraw node,
diff --git a/tools/testing/selftests/hid/progs/hid.c b/tools/testing/selftests/hid/progs/hid.c
index 95a03fb0494a..aae8d7a0699e 100644
--- a/tools/testing/selftests/hid/progs/hid.c
+++ b/tools/testing/selftests/hid/progs/hid.c
@@ -286,6 +286,63 @@ int BPF_PROG(hid_offload_notify, struct hid_bpf_ctx *hid_ctx)
return 0;
}
+SEC("?fmod_ret.s/hid_bpf_offload")
+int BPF_PROG(hid_offload_multiply_events, struct hid_bpf_ctx *hid_ctx)
+{
+ struct test_report buf;
+ int err;
+
+ /* do not pop the event, it'll be done in hid_offload_test() when
+ * notifying user space; this also allows us to retry sending it
+ * if hid_bpf_input_report fails */
+ if (bpf_map_peek_elem(&queue, &buf))
+ return 0;
+
+ buf.data[1] += 10;
+ /* inject another event to be processed */
+ if (buf.data[1] < 60)
+ bpf_map_push_elem(&queue, &buf, BPF_ANY);
+
+ return 0;
+}
+
+SEC("?fmod_ret.s/hid_bpf_offload")
+int BPF_PROG(hid_offload_inject, struct hid_bpf_ctx *hid_ctx)
+{
+ struct test_report buf;
+ int err;
+
+ /* do not pop the event, it'll be done in hid_offload_test() when
+ * notifying user space; this also allows us to retry sending it
+ * if hid_bpf_input_report fails */
+ if (bpf_map_peek_elem(&queue, &buf))
+ return 0;
+
+ buf.data[0] = 2;
+
+ /* re-inject the modified event into the HID stack */
+ err = hid_bpf_input_report(hid_ctx, HID_INPUT_REPORT, buf.data, sizeof(buf.data));
+ if (err == -16 /* -EBUSY */) {
+ /*
+ * This happens when we schedule the work with a 0 delay:
+ * the thread immediately starts but the current input
+ * processing hasn't finished yet. So the semaphore is
+ * already taken, and hid_input_report returns -EBUSY
+ */
+ /* schedule another attempt */
+ hid_bpf_schedule_delayed_work(hid_ctx, 5);
+
+ /* return an error so that we don't trigger hid_offload_test()
+ * and pop the element */
+ return err;
+ }
+
+ /* call ourselves once again until there are no more events in the queue */
+ hid_bpf_schedule_delayed_work(hid_ctx, 5);
+
+ return 0;
+}
+
SEC("?fmod_ret/hid_bpf_device_event")
int BPF_PROG(hid_defer_event, struct hid_bpf_ctx *hctx)
{
@@ -296,6 +353,11 @@ int BPF_PROG(hid_defer_event, struct hid_bpf_ctx *hctx)
if (!data)
return 0; /* EPERM check */
+ /* Only schedule a delayed work when reportID is 1, otherwise
+ * simply forward it to hidraw */
+ if (data[0] != 1)
+ return 0;
+
__builtin_memcpy(&buf.data, data, 6);
delay = *(__u32 *)&data[2];
@@ -305,3 +367,45 @@ int BPF_PROG(hid_defer_event, struct hid_bpf_ctx *hctx)
return -1; /* discard the event */
}
+
+SEC("?fmod_ret.s/hid_bpf_offload")
+int BPF_PROG(hid_offload_hw_request, struct hid_bpf_ctx *hid_ctx)
+{
+ struct test_report buf;
+ __u8 data[6] = {1};
+ int ret;
+
+ ret = hid_bpf_hw_request(hid_ctx,
+ data,
+ sizeof(data),
+ HID_INPUT_REPORT,
+ HID_REQ_GET_REPORT);
+
+ return 0;
+}
+SEC("?fmod_ret/hid_bpf_device_event")
+int BPF_PROG(hid_defer_multiple_events, struct hid_bpf_ctx *hctx)
+{
+ __u8 *data = hid_bpf_get_data(hctx, 0 /* offset */, 4 /* size */);
+ struct test_report buf = {
+ .data = {2, 3, 4, 5, 6, 7},
+ };
+
+ if (!data)
+ return 0; /* EPERM check */
+
+ /* Only schedule a delayed work when reportID is 1, otherwise
+ * simply forward it to hidraw */
+ if (data[0] != 1)
+ return 0;
+
+ bpf_map_push_elem(&queue, &buf, BPF_ANY);
+ buf.data[0] = 2;
+ buf.data[1] = 4;
+ buf.data[2] = 6;
+ bpf_map_push_elem(&queue, &buf, BPF_ANY);
+
+ hid_bpf_schedule_delayed_work(hctx, 10);
+
+ return -1; /* discard the event */
+}
--
2.43.0
Benjamin Tissoires <[email protected]> writes:
> [Putting this as a RFC because I'm pretty sure I'm not doing the things
> correctly at the BPF level.]
> [Also using bpf-next as the base tree as there will be conflicting
> changes otherwise]
>
> Ideally I'd like to have something similar to bpf_timers, but not
> in soft IRQ context. So I'm emulating this with a sleepable
> bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
> workqueue").
Why implement a new mechanism? Sounds like what you need is essentially
the bpf_timer functionality, just running in a different context, right?
So why not just add a flag to the timer setup that controls the callback
context? I've been toying with something similar for restarting XDP TX
for my queueing patch series (though I'm not sure if this will actually
end up being needed in the end):
https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp-queueing-08&id=54bc201a358d1ac6ebfe900099315bbd0a76e862
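For illustration, the BPF side could then look something like this
(completely untested, and BPF_F_TIMER_SLEEPABLE is a made-up name for the
hypothetical setup flag):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct elem {
        struct bpf_timer t;
};

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 1);
        __type(key, int);
        __type(value, struct elem);
} timers SEC(".maps");

static int wq_cb(void *map, int *key, struct bpf_timer *timer)
{
        /* with the hypothetical flag, this callback would run from a
         * workqueue instead of softirq, so sleepable kfuncs would be
         * usable here */
        return 0;
}

SEC("fmod_ret/hid_bpf_device_event")
int BPF_PROG(defer_event, struct hid_bpf_ctx *hctx)
{
        int key = 0;
        struct elem *e = bpf_map_lookup_elem(&timers, &key);

        if (!e)
                return 0;

        bpf_timer_init(&e->t, &timers, CLOCK_MONOTONIC | BPF_F_TIMER_SLEEPABLE);
        bpf_timer_set_callback(&e->t, wq_cb);
        bpf_timer_start(&e->t, 0, 0);
        return 0;
}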
-Toke
On Fri, Feb 9, 2024 at 4:42 PM Toke Høiland-Jørgensen <[email protected]> wrote:
>
> Benjamin Tissoires <[email protected]> writes:
>
> > [Putting this as a RFC because I'm pretty sure I'm not doing the things
> > correctly at the BPF level.]
> > [Also using bpf-next as the base tree as there will be conflicting
> > changes otherwise]
> >
> > Ideally I'd like to have something similar to bpf_timers, but not
> > in soft IRQ context. So I'm emulating this with a sleepable
> > bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
> > workqueue").
>
> Why implement a new mechanism? Sounds like what you need is essentially
> the bpf_timer functionality, just running in a different context, right?
Heh, that's exactly why I put in a RFC :)
So yes, the bpf_timer approach is cleaner, but I need it in a
workqueue, as a hrtimer in a softIRQ would prevent me to kzalloc and
wait for the device.
> So why not just add a flag to the timer setup that controls the callback
> context? I've been toying with something similar for restarting XDP TX
> for my queueing patch series (though I'm not sure if this will actually
> end up being needed in the end):
>
> https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp-queueing-08&id=54bc201a358d1ac6ebfe900099315bbd0a76e862
>
Oh, nice. Good idea. But would it be OK to have a "timer-like" where
it actually defers the job in a workqueue instead of using an hrtimer?
I thought I would have to rewrite the entire bpf_timer approach
without the softIRQ, but if I can just add a new flag, that will make
things way simpler for me.
This however raises another issue if I were to use the bpf_timers: now
the HID-BPF kfuncs will not be available as they are only available to
tracing prog types. And when I tried to call them from a bpf_timer (in
softIRQ) they were not available.
Cheers,
Benjamin
Benjamin Tissoires <[email protected]> writes:
> On Fri, Feb 9, 2024 at 4:42 PM Toke Høiland-Jørgensen <[email protected]> wrote:
>>
>> Benjamin Tissoires <[email protected]> writes:
>>
>> > [Putting this as a RFC because I'm pretty sure I'm not doing the things
>> > correctly at the BPF level.]
>> > [Also using bpf-next as the base tree as there will be conflicting
>> > changes otherwise]
>> >
>> > Ideally I'd like to have something similar to bpf_timers, but not
>> > in soft IRQ context. So I'm emulating this with a sleepable
>> > bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
>> > workqueue").
>>
>> Why implement a new mechanism? Sounds like what you need is essentially
>> the bpf_timer functionality, just running in a different context, right?
>
> Heh, that's exactly why I put in a RFC :)
>
> So yes, the bpf_timer approach is cleaner, but I need it in a
> workqueue, as a hrtimer in a softIRQ would prevent me to kzalloc and
> wait for the device.
Right, makes sense.
>> So why not just add a flag to the timer setup that controls the callback
>> context? I've been toying with something similar for restarting XDP TX
>> for my queueing patch series (though I'm not sure if this will actually
>> end up being needed in the end):
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp-queueing-08&id=54bc201a358d1ac6ebfe900099315bbd0a76e862
>>
>
> Oh, nice. Good idea. But would it be OK to have a "timer-like" where
> it actually defers the job in a workqueue instead of using an hrtimer?
That's conceptually still a timer, though, isn't it? I.e., it's a
mechanism whereby you specify a callback and a delay, and bpf_timer
ensures that your callback is called after that delay. IMO it's totally
congruent with that API to be able to specify a different execution
context as part of the timer setup.
As for how to implement it, I suspect the easiest may be something
similar to what the patch I linked above does: keep the hrtimer, and
just have a different (kernel) callback function when the timer fires
which does an immediate schedule_work() (without the _delayed) and then
runs the BPF callback in that workqueue. I.e., keep the delay handling
the way the existing bpf_timer implementation does it, and just add an
indirection to start the workqueue in the kernel dispatch code.
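Roughly something like this on the kernel side (very rough sketch; the
sleepable_cb and work members don't exist today, they are only there to
illustrate the indirection):

static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer)
{
        struct bpf_hrtimer *t = container_of(hrtimer, struct bpf_hrtimer, timer);

        if (t->sleepable_cb) {
                /* defer the BPF callback to process context */
                schedule_work(&t->work);
                return HRTIMER_NORESTART;
        }

        /* ... existing softirq invocation of t->callback_fn stays as-is ... */
        return HRTIMER_NORESTART;
}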
> I thought I would have to rewrite the entire bpf_timer approach
> without the softIRQ, but if I can just add a new flag, that will make
> things way simpler for me.
IMO that would be fine. You may want to wait for the maintainers to
chime in before going down this route, though :)
> This however raises another issue if I were to use the bpf_timers: now
> the HID-BPF kfuncs will not be available as they are only available to
> tracing prog types. And when I tried to call them from a bpf_timer (in
> softIRQ) they were not available.
IIUC, the bpf_timer callback is just a function (subprog) from the
verifier PoV, so it is verified as whatever program type is creating the
timer. So in other words, as long as you setup the timer from inside a
tracing prog type, you should have access to all the same kfuncs, I
think?
-Toke
On Fri, Feb 9, 2024 at 6:05 PM Toke Høiland-Jørgensen <[email protected]> wrote:
>
> Benjamin Tissoires <[email protected]> writes:
>
> > On Fri, Feb 9, 2024 at 4:42 PM Toke Høiland-Jørgensen <[email protected]> wrote:
> >>
> >> Benjamin Tissoires <[email protected]> writes:
> >>
> >> > [Putting this as a RFC because I'm pretty sure I'm not doing the things
> >> > correctly at the BPF level.]
> >> > [Also using bpf-next as the base tree as there will be conflicting
> >> > changes otherwise]
> >> >
> >> > Ideally I'd like to have something similar to bpf_timers, but not
> >> > in soft IRQ context. So I'm emulating this with a sleepable
> >> > bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed
> >> > workqueue").
> >>
> >> Why implement a new mechanism? Sounds like what you need is essentially
> >> the bpf_timer functionality, just running in a different context, right?
> >
> > Heh, that's exactly why I put in a RFC :)
> >
> > So yes, the bpf_timer approach is cleaner, but I need it in a
> > workqueue, as a hrtimer in a softIRQ would prevent me to kzalloc and
> > wait for the device.
>
> Right, makes sense.
>
> >> So why not just add a flag to the timer setup that controls the callback
> >> context? I've been toying with something similar for restarting XDP TX
> >> for my queueing patch series (though I'm not sure if this will actually
> >> end up being needed in the end):
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp-queueing-08&id=54bc201a358d1ac6ebfe900099315bbd0a76e862
> >>
> >
> > Oh, nice. Good idea. But would it be OK to have a "timer-like" where
> > it actually defers the job in a workqueue instead of using an hrtimer?
>
> That's conceptually still a timer, though, isn't it? I.e., it's a
> mechanism whereby you specify a callback and a delay, and bpf_timer
> ensures that your callback is called after that delay. IMO it's totally
> congruent with that API to be able to specify a different execution
> context as part of the timer setup.
Yep :)
There is still a problem I wasn't able to fix over the weekend and
today. How can I tell the verifier that the callback is sleepable,
when the tracing function that called the timer_start() function is
not?
(more on that below).
>
> As for how to implement it, I suspect the easiest may be something
> similar to what the patch I linked above does: keep the hrtimer, and
> just have a different (kernel) callback function when the timer fires
> which does an immediate schedule_work() (without the _delayed) and then
> runs the BPF callback in that workqueue. I.e., keep the delay handling
> the way the existing bpf_timer implementation does it, and just add an
> indirection to start the workqueue in the kernel dispatch code.
Sounds good, especially given that's roughly how the delayed_timers
are implemented.
>
> > I thought I would have to rewrite the entire bpf_timer approach
> > without the softIRQ, but if I can just add a new flag, that will make
> > things way simpler for me.
>
> IMO that would be fine. You may want to wait for the maintainers to
> chime in before going down this route, though :)
>
> > This however raises another issue if I were to use the bpf_timers: now
> > the HID-BPF kfuncs will not be available as they are only available to
> > tracing prog types. And when I tried to call them from a bpf_timer (in
> > softIRQ) they were not available.
>
> IIUC, the bpf_timer callback is just a function (subprog) from the
> verifier PoV, so it is verified as whatever program type is creating the
> timer. So in other words, as long as you setup the timer from inside a
> tracing prog type, you should have access to all the same kfuncs, I
> think?
Yep, you are correct. But as mentioned above, I am now in trouble with
the sleepable state:
- I need to call timer_start() from a non sleepable tracing function
(I'm in hard IRQ when dealing with a physical device)
- but then, ideally, the callback function needs to be tagged as a
sleepable one, so I can export my kfuncs which are doing kzalloc and
device IO as such.
However, I can not really teach the BPF verifier to do so:
- it seems to check for the callback first when it is loaded, and
there is no SEC() equivalent for static functions
- libbpf doesn't have access to the callback as a prog as it has to be
a static function, and thus isn't exported as a full-blown prog.
- the verifier only checks for the callback when dealing with
BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
(though the validation of the callback has already been done while
checking it first, so we are already too late to change the sleepable
state of the callback)
Right now, the only OK-ish version I have is declaring the kfunc as
non-sleepable, but checking that we are in a different context than
the IRQ of the initial event. This way, I am not crashing if this
function is called from the initial IRQ, but will still crash if used
outside of the hid context.
This is not satisfactory, but I feel like it's going to be hard to
teach the verifier that the callback function is sleepable in that
case (maybe we could suffix the callback name, like we do for
arguments, but this is not very clean either).
Cheers,
Benjamin
Benjamin Tissoires <[email protected]> writes:
[...]
>> IIUC, the bpf_timer callback is just a function (subprog) from the
>> verifier PoV, so it is verified as whatever program type is creating the
>> timer. So in other words, as long as you setup the timer from inside a
>> tracing prog type, you should have access to all the same kfuncs, I
>> think?
>
> Yep, you are correct. But as mentioned above, I am now in trouble with
> the sleepable state:
> - I need to call timer_start() from a non sleepable tracing function
> (I'm in hard IRQ when dealing with a physical device)
> - but then, ideally, the callback function needs to be tagged as a
> sleepable one, so I can export my kfuncs which are doing kzalloc and
> device IO as such.
>
> However, I can not really teach the BPF verifier to do so:
> - it seems to check for the callback first when it is loaded, and
> there is no SEC() equivalent for static functions
> - libbpf doesn't have access to the callback as a prog as it has to be
> a static function, and thus isn't exported as a full-blown prog.
> - the verifier only checks for the callback when dealing with
> BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
> (though the validation of the callback has already been done while
> checking it first, so we are already too late to change the sleepable
> state of the callback)
>
> Right now, the only OK-ish version I have is declaring the kfunc as
> non-sleepable, but checking that we are in a different context than
> the IRQ of the initial event. This way, I am not crashing if this
> function is called from the initial IRQ, but will still crash if used
> outside of the hid context.
>
> This is not satisfactory, but I feel like it's going to be hard to
> teach the verifier that the callback function is sleepable in that
> case (maybe we could suffix the callback name, like we do for
> arguments, but this is not very clean either).
The callback is only set once when the timer is first setup; I *think*
it works to do the setup (bpf_timer_init() and bpf_timer_set_callback())
in the context you need (from a sleepable prog), but do the arming
(bpf_timer_start()) from a different program that is not itself sleepable?
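I.e. something along these lines (untested sketch, the map and program
names are arbitrary):

struct elem {
        struct bpf_timer t;
};

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 1);
        __type(key, int);
        __type(value, struct elem);
} timers SEC(".maps");

static int timer_cb(void *map, int *key, struct bpf_timer *timer)
{
        /* verified in the context of the program doing the setup below */
        return 0;
}

/* sleepable program: do the one-time setup here */
SEC("syscall")
int setup_timer(void *ctx)
{
        int key = 0;
        struct elem *e = bpf_map_lookup_elem(&timers, &key);

        if (!e)
                return 1;

        bpf_timer_init(&e->t, &timers, CLOCK_MONOTONIC);
        bpf_timer_set_callback(&e->t, timer_cb);
        return 0;
}

/* non-sleepable tracing program (hard IRQ): only arm the timer */
SEC("fmod_ret/hid_bpf_device_event")
int BPF_PROG(arm_timer, struct hid_bpf_ctx *hctx)
{
        int key = 0;
        struct elem *e = bpf_map_lookup_elem(&timers, &key);

        if (!e)
                return 0;

        bpf_timer_start(&e->t, 0, 0);
        return 0;
}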
-Toke
On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
>
> Benjamin Tissoires <[email protected]> writes:
>
> [...]
> >> IIUC, the bpf_timer callback is just a function (subprog) from the
> >> verifier PoV, so it is verified as whatever program type is creating the
> >> timer. So in other words, as long as you setup the timer from inside a
> >> tracing prog type, you should have access to all the same kfuncs, I
> >> think?
> >
> > Yep, you are correct. But as mentioned above, I am now in trouble with
> > the sleepable state:
> > - I need to call timer_start() from a non sleepable tracing function
> > (I'm in hard IRQ when dealing with a physical device)
> > - but then, ideally, the callback function needs to be tagged as a
> > sleepable one, so I can export my kfuncs which are doing kzalloc and
> > device IO as such.
> >
> > However, I can not really teach the BPF verifier to do so:
> > - it seems to check for the callback first when it is loaded, and
> > there is no SEC() equivalent for static functions
> > - libbpf doesn't have access to the callback as a prog as it has to be
> > a static function, and thus isn't exported as a full-blown prog.
> > - the verifier only checks for the callback when dealing with
> > BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
> > (though the validation of the callback has already been done while
> > checking it first, so we are already too late to change the sleepable
> > state of the callback)
> >
> > Right now, the only OK-ish version I have is declaring the kfunc as
> > non-sleepable, but checking that we are in a different context than
> > the IRQ of the initial event. This way, I am not crashing if this
> > function is called from the initial IRQ, but will still crash if used
> > outside of the hid context.
> >
> > This is not satisfactory, but I feel like it's going to be hard to
> > teach the verifier that the callback function is sleepable in that
> > case (maybe we could suffix the callback name, like we do for
> > arguments, but this is not very clean either).
>
> The callback is only set once when the timer is first setup; I *think*
> it works to do the setup (bpf_timer_init() and bpf_timer_set_callback())
> in the context you need (from a sleepable prog), but do the arming
> (bpf_timer_start()) from a different program that is not itself sleepable?
>
Genius! It works, and I can just keep having them declared as a
syscall kfunc, not as a tracing kfunc.
But isn't this an issue outside of my use case? I mean, if the
callback is assuming the environment for when it is set up but can be
called from any context there seems to be a problem when 2 contexts
are not equivalent, no?
On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
<[email protected]> wrote:
>
> On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
> >
> > Benjamin Tissoires <[email protected]> writes:
> >
> > [...]
> > >> IIUC, the bpf_timer callback is just a function (subprog) from the
> > >> verifier PoV, so it is verified as whatever program type is creating the
> > >> timer. So in other words, as long as you setup the timer from inside a
> > >> tracing prog type, you should have access to all the same kfuncs, I
> > >> think?
> > >
> > > Yep, you are correct. But as mentioned above, I am now in trouble with
> > > the sleepable state:
> > > - I need to call timer_start() from a non sleepable tracing function
> > > (I'm in hard IRQ when dealing with a physical device)
> > > - but then, ideally, the callback function needs to be tagged as a
> > > sleepable one, so I can export my kfuncs which are doing kzalloc and
> > > device IO as such.
> > >
> > > However, I can not really teach the BPF verifier to do so:
> > > - it seems to check for the callback first when it is loaded, and
> > > there is no SEC() equivalent for static functions
> > > - libbpf doesn't have access to the callback as a prog as it has to be
> > > a static function, and thus isn't exported as a full-blown prog.
> > > - the verifier only checks for the callback when dealing with
> > > BPF_FUNC_timer_set_callback, which doesn't have a "flag" argument
> > > (though the validation of the callback has already been done while
> > > checking it first, so we are already too late to change the sleepable
> > > state of the callback)
> > >
> > > Right now, the only OK-ish version I have is declaring the kfunc as
> > > non-sleepable, but checking that we are in a different context than
> > > the IRQ of the initial event. This way, I am not crashing if this
> > > function is called from the initial IRQ, but will still crash if used
> > > outside of the hid context.
> > >
> > > This is not satisfactory, but I feel like it's going to be hard to
> > > teach the verifier that the callback function is sleepable in that
> > > case (maybe we could suffix the callback name, like we do for
> > > arguments, but this is not very clean either).
> >
> > The callback is only set once when the timer is first setup; I *think*
> > it works to do the setup (bpf_timer_init() and bpf_timer_set_callback())
> > in the context you need (from a sleepable prog), but do the arming
> > (bpf_timer_start()) from a different program that is not itself sleepable?
> >
>
> Genius! It works, and I can just keep having them declared as a
> syscall kfunc, not as a tracing kfunc.
>
> But isn't this an issue outside of my use case? I mean, if the
> callback is assuming the environment for when it is set up but can be
> called from any context there seems to be a problem when 2 contexts
> are not equivalent, no?
I agree that workqueue delegation fits into the bpf_timer concept and
a lot of code can and should be shared.
All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
Too bad, bpf_timer_set_callback() doesn't have a flag argument,
so we need a new kfunc to set a sleepable callback.
Maybe
bpf_timer_set_sleepable_cb() ?
The verifier will set is_async_cb = true for it (like it does for regular cb-s).
And since prog->aux->sleepable is kinda "global" we need another
per subprog flag:
bool is_sleepable: 1;
We can factor out a check "if (prog->aux->sleepable)" into a helper
that will check that "global" flag and another env->cur_state->in_sleepable
flag that will work similar to active_rcu_lock.
Once the verifier starts processing subprog->is_sleepable
it will set cur_state->in_sleepable = true;
to make all subprogs called from that cb to be recognized as sleepable too.
A bit of a challenge is what to do with global subprogs,
since they're verified lazily. They can be called from
sleepable and non-sleepable contexts. Should be solvable.
Overall I think this feature is needed urgently,
so if you don't have cycles to work on this soon,
I can prioritize it right after bpf_arena work.
On Feb 12 2024, Alexei Starovoitov wrote:
> On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> <[email protected]> wrote:
> >
> > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
> > >
> > > Benjamin Tissoires <[email protected]> writes:
> > >
[...]
> I agree that workqueue delegation fits into the bpf_timer concept and
> a lot of code can and should be shared.
Thanks Alexei for the detailed answer. I've given it an attempt but still can not
figure it out entirely.
> All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> so we need a new kfunc to set a sleepable callback.
> Maybe
> bpf_timer_set_sleepable_cb() ?
OK. So I guess I should drop Toke's suggestion with the bpf_timer_init() flag?
> The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> And since prog->aux->sleepable is kinda "global" we need another
> per subprog flag:
> bool is_sleepable: 1;
done (in push_callback_call())
>
> We can factor out a check "if (prog->aux->sleepable)" into a helper
> that will check that "global" flag and another env->cur_state->in_sleepable
> flag that will work similar to active_rcu_lock.
done (I think), cf patch 2 below
> Once the verifier starts processing subprog->is_sleepable
> it will set cur_state->in_sleepable = true;
> to make all subprogs called from that cb to be recognized as sleepable too.
That's the point I don't know where to put the new code.
It seems the best place would be in do_check(), but I am under the impression
that the code of the callback is added at the end of the instruction list, meaning
that I do not know where it starts, and which subprog index it corresponds to.
>
> A bit of a challenge is what to do with global subprogs,
> since they're verified lazily. They can be called from
> sleepable and non-sleepable contexts. Should be solvable.
I must confess this is way over me (and given that I didn't even managed to make
the "easy" case working, that might explain things a little :-P )
>
> Overall I think this feature is needed urgently,
> so if you don't have cycles to work on this soon,
> I can prioritize it right after bpf_arena work.
I can try to spare a few cycles on it. Even if your instructions were
spot on, I still can't make the subprogs recognized as sleepable.
For reference, this is where I am (probably bogus, but seems to be
working when timer_set_sleepable_cb() is called from a sleepable context
as mentioned by Toke):
---
From d4aa3d969fa9a89c6447d843dad338fde2ac0155 Mon Sep 17 00:00:00 2001
From: Benjamin Tissoires <[email protected]>
Date: Tue, 13 Feb 2024 18:40:01 +0100
Subject: [PATCH RFC bpf-next v2 01/11] Sleepable timers
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <[email protected]>
---
include/linux/bpf_verifier.h | 2 +
include/uapi/linux/bpf.h | 13 ++++++
kernel/bpf/helpers.c | 91 +++++++++++++++++++++++++++++++++++++++---
kernel/bpf/verifier.c | 20 ++++++++--
tools/include/uapi/linux/bpf.h | 13 ++++++
5 files changed, 130 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 84365e6dd85d..789ef5fec547 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -426,6 +426,7 @@ struct bpf_verifier_state {
* while they are still in use.
*/
bool used_as_loop_entry;
+ bool in_sleepable;
/* first and last insn idx of this verifier state */
u32 first_insn_idx;
@@ -626,6 +627,7 @@ struct bpf_subprog_info {
bool is_async_cb: 1;
bool is_exception_cb: 1;
bool args_cached: 1;
+ bool is_sleepable: 1;
u8 arg_cnt;
struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS];
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d96708380e52..ef1f2be4cfef 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5742,6 +5742,18 @@ union bpf_attr {
* 0 on success.
*
* **-ENOENT** if the bpf_local_storage cannot be found.
+ *
+ * long bpf_timer_set_sleepable_cb(struct bpf_timer *timer, void *callback_fn)
+ * Description
+ * Configure the timer to call *callback_fn* static function in a
+ * sleepable context.
+ * Return
+ * 0 on success.
+ * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier.
+ * **-EPERM** if *timer* is in a map that doesn't have any user references.
+ * The user space should either hold a file descriptor to a map with timers
+ * or pin such map in bpffs. When map is unpinned or file descriptor is
+ * closed all timers in the map will be cancelled and freed.
*/
#define ___BPF_FUNC_MAPPER(FN, ctx...) \
FN(unspec, 0, ##ctx) \
@@ -5956,6 +5968,7 @@ union bpf_attr {
FN(user_ringbuf_drain, 209, ##ctx) \
FN(cgrp_storage_get, 210, ##ctx) \
FN(cgrp_storage_delete, 211, ##ctx) \
+ FN(timer_set_sleepable_cb, 212, ##ctx) \
/* */
/* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 4db1c658254c..e3b83d27b1b6 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1097,9 +1097,11 @@ const struct bpf_func_proto bpf_snprintf_proto = {
*/
struct bpf_hrtimer {
struct hrtimer timer;
+ struct work_struct work;
struct bpf_map *map;
struct bpf_prog *prog;
void __rcu *callback_fn;
+ void __rcu *sleepable_cb_fn;
void *value;
};
@@ -1113,18 +1115,64 @@ struct bpf_timer_kern {
struct bpf_spin_lock lock;
} __attribute__((aligned(8)));
+static void bpf_timer_work_cb(struct work_struct *work)
+{
+ struct bpf_hrtimer *t = container_of(work, struct bpf_hrtimer, work);
+ struct bpf_map *map = t->map;
+ void *value = t->value;
+ bpf_callback_t callback_fn;
+ void *key;
+ u32 idx;
+
+ BTF_TYPE_EMIT(struct bpf_timer);
+
+ rcu_read_lock();
+ callback_fn = rcu_dereference(t->sleepable_cb_fn);
+ rcu_read_unlock();
+ if (!callback_fn)
+ return;
+
+ // /* bpf_timer_work_cb() runs in hrtimer_run_softirq. It doesn't migrate and
+ // * cannot be preempted by another bpf_timer_cb() on the same cpu.
+ // * Remember the timer this callback is servicing to prevent
+ // * deadlock if callback_fn() calls bpf_timer_cancel() or
+ // * bpf_map_delete_elem() on the same timer.
+ // */
+ // this_cpu_write(hrtimer_running, t);
+ if (map->map_type == BPF_MAP_TYPE_ARRAY) {
+ struct bpf_array *array = container_of(map, struct bpf_array, map);
+
+ /* compute the key */
+ idx = ((char *)value - array->value) / array->elem_size;
+ key = &idx;
+ } else { /* hash or lru */
+ key = value - round_up(map->key_size, 8);
+ }
+
+ callback_fn((u64)(long)map, (u64)(long)key, (u64)(long)value, 0, 0);
+ /* The verifier checked that return value is zero. */
+
+ // this_cpu_write(hrtimer_running, NULL);
+}
+
static DEFINE_PER_CPU(struct bpf_hrtimer *, hrtimer_running);
static enum hrtimer_restart bpf_timer_cb(struct hrtimer *hrtimer)
{
struct bpf_hrtimer *t = container_of(hrtimer, struct bpf_hrtimer, timer);
+ bpf_callback_t callback_fn, sleepable_cb_fn;
struct bpf_map *map = t->map;
void *value = t->value;
- bpf_callback_t callback_fn;
void *key;
u32 idx;
BTF_TYPE_EMIT(struct bpf_timer);
+ sleepable_cb_fn = rcu_dereference_check(t->sleepable_cb_fn, rcu_read_lock_bh_held());
+ if (sleepable_cb_fn) {
+ schedule_work(&t->work);
+ goto out;
+ }
+
callback_fn = rcu_dereference_check(t->callback_fn, rcu_read_lock_bh_held());
if (!callback_fn)
goto out;
@@ -1190,7 +1238,9 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map
t->map = map;
t->prog = NULL;
rcu_assign_pointer(t->callback_fn, NULL);
+ rcu_assign_pointer(t->sleepable_cb_fn, NULL);
hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT);
+ INIT_WORK(&t->work, bpf_timer_work_cb);
t->timer.function = bpf_timer_cb;
WRITE_ONCE(timer->timer, t);
/* Guarantee the order between timer->timer and map->usercnt. So
@@ -1221,8 +1271,8 @@ static const struct bpf_func_proto bpf_timer_init_proto = {
.arg3_type = ARG_ANYTHING,
};
-BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn,
- struct bpf_prog_aux *, aux)
+static int __bpf_timer_set_callback(struct bpf_timer_kern *timer, void *callback_fn,
+ struct bpf_prog_aux *aux, bool is_sleepable)
{
struct bpf_prog *prev, *prog = aux->prog;
struct bpf_hrtimer *t;
@@ -1260,12 +1310,24 @@ BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callb
bpf_prog_put(prev);
t->prog = prog;
}
- rcu_assign_pointer(t->callback_fn, callback_fn);
+ if (is_sleepable) {
+ rcu_assign_pointer(t->sleepable_cb_fn, callback_fn);
+ rcu_assign_pointer(t->callback_fn, NULL);
+ } else {
+ rcu_assign_pointer(t->callback_fn, callback_fn);
+ rcu_assign_pointer(t->sleepable_cb_fn, NULL);
+ }
out:
__bpf_spin_unlock_irqrestore(&timer->lock);
return ret;
}
+BPF_CALL_3(bpf_timer_set_callback, struct bpf_timer_kern *, timer, void *, callback_fn,
+ struct bpf_prog_aux *, aux)
+{
+ return __bpf_timer_set_callback(timer, callback_fn, aux, false);
+}
+
static const struct bpf_func_proto bpf_timer_set_callback_proto = {
.func = bpf_timer_set_callback,
.gpl_only = true,
@@ -1274,6 +1336,20 @@ static const struct bpf_func_proto bpf_timer_set_callback_proto = {
.arg2_type = ARG_PTR_TO_FUNC,
};
+BPF_CALL_3(bpf_timer_set_sleepable_cb, struct bpf_timer_kern *, timer, void *, callback_fn,
+ struct bpf_prog_aux *, aux)
+{
+ return __bpf_timer_set_callback(timer, callback_fn, aux, true);
+}
+
+static const struct bpf_func_proto bpf_timer_set_sleepable_cb_proto = {
+ .func = bpf_timer_set_sleepable_cb,
+ .gpl_only = true,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_PTR_TO_TIMER,
+ .arg2_type = ARG_PTR_TO_FUNC,
+};
+
BPF_CALL_3(bpf_timer_start, struct bpf_timer_kern *, timer, u64, nsecs, u64, flags)
{
struct bpf_hrtimer *t;
@@ -1353,6 +1429,7 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer)
* if it was running.
*/
ret = ret ?: hrtimer_cancel(&t->timer);
+ ret = ret ?: cancel_work_sync(&t->work);
return ret;
}
@@ -1405,8 +1482,10 @@ void bpf_timer_cancel_and_free(void *val)
* effectively cancelled because bpf_timer_cb() will return
* HRTIMER_NORESTART.
*/
- if (this_cpu_read(hrtimer_running) != t)
+ if (this_cpu_read(hrtimer_running) != t) {
hrtimer_cancel(&t->timer);
+ }
+ cancel_work_sync(&t->work);
kfree(t);
}
@@ -1749,6 +1828,8 @@ bpf_base_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
return &bpf_timer_init_proto;
case BPF_FUNC_timer_set_callback:
return &bpf_timer_set_callback_proto;
+ case BPF_FUNC_timer_set_sleepable_cb:
+ return &bpf_timer_set_sleepable_cb_proto;
case BPF_FUNC_timer_start:
return &bpf_timer_start_proto;
case BPF_FUNC_timer_cancel:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 64fa188d00ad..400c625efe22 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -513,7 +513,8 @@ static bool is_sync_callback_calling_function(enum bpf_func_id func_id)
static bool is_async_callback_calling_function(enum bpf_func_id func_id)
{
- return func_id == BPF_FUNC_timer_set_callback;
+ return func_id == BPF_FUNC_timer_set_callback ||
+ func_id == BPF_FUNC_timer_set_sleepable_cb;
}
static bool is_callback_calling_function(enum bpf_func_id func_id)
@@ -1414,6 +1415,7 @@ static int copy_verifier_state(struct bpf_verifier_state *dst_state,
}
dst_state->speculative = src->speculative;
dst_state->active_rcu_lock = src->active_rcu_lock;
+ dst_state->in_sleepable = src->in_sleepable;
dst_state->curframe = src->curframe;
dst_state->active_lock.ptr = src->active_lock.ptr;
dst_state->active_lock.id = src->active_lock.id;
@@ -9434,11 +9436,13 @@ static int push_callback_call(struct bpf_verifier_env *env, struct bpf_insn *ins
if (insn->code == (BPF_JMP | BPF_CALL) &&
insn->src_reg == 0 &&
- insn->imm == BPF_FUNC_timer_set_callback) {
+ (insn->imm == BPF_FUNC_timer_set_callback ||
+ insn->imm == BPF_FUNC_timer_set_sleepable_cb)) {
struct bpf_verifier_state *async_cb;
/* there is no real recursion here. timer callbacks are async */
env->subprog_info[subprog].is_async_cb = true;
+ env->subprog_info[subprog].is_sleepable = insn->imm == BPF_FUNC_timer_set_sleepable_cb;
async_cb = push_async_cb(env, env->subprog_info[subprog].start,
insn_idx, subprog);
if (!async_cb)
@@ -10280,6 +10284,8 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
set_map_elem_callback_state);
break;
case BPF_FUNC_timer_set_callback:
+ fallthrough;
+ case BPF_FUNC_timer_set_sleepable_cb:
err = push_callback_call(env, insn, insn_idx, meta.subprogno,
set_timer_callback_state);
break;
@@ -15586,7 +15592,9 @@ static int visit_insn(int t, struct bpf_verifier_env *env)
return DONE_EXPLORING;
case BPF_CALL:
- if (insn->src_reg == 0 && insn->imm == BPF_FUNC_timer_set_callback)
+ if (insn->src_reg == 0 &&
+ (insn->imm == BPF_FUNC_timer_set_callback ||
+ insn->imm == BPF_FUNC_timer_set_sleepable_cb))
/* Mark this call insn as a prune point to trigger
* is_state_visited() check before call itself is
* processed by __check_func_call(). Otherwise new
@@ -16762,6 +16770,9 @@ static bool states_equal(struct bpf_verifier_env *env,
if (old->active_rcu_lock != cur->active_rcu_lock)
return false;
+ if (old->in_sleepable != cur->in_sleepable)
+ return false;
+
/* for states to be equal callsites have to be the same
* and all frame states need to be equivalent
*/
@@ -19639,7 +19650,8 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
continue;
}
- if (insn->imm == BPF_FUNC_timer_set_callback) {
+ if (insn->imm == BPF_FUNC_timer_set_callback ||
+ insn->imm == BPF_FUNC_timer_set_sleepable_cb) {
/* The verifier will process callback_fn as many times as necessary
* with different maps and the register states prepared by
* set_timer_callback_state will be accurate.
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d96708380e52..ef1f2be4cfef 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5742,6 +5742,18 @@ union bpf_attr {
* 0 on success.
*
* **-ENOENT** if the bpf_local_storage cannot be found.
+ *
+ * long bpf_timer_set_sleepable_cb(struct bpf_timer *timer, void *callback_fn)
+ * Description
+ * Configure the timer to call *callback_fn* static function in a
+ * sleepable context.
+ * Return
+ * 0 on success.
+ * **-EINVAL** if *timer* was not initialized with bpf_timer_init() earlier.
+ * **-EPERM** if *timer* is in a map that doesn't have any user references.
+ * The user space should either hold a file descriptor to a map with timers
+ * or pin such map in bpffs. When map is unpinned or file descriptor is
+ * closed all timers in the map will be cancelled and freed.
*/
#define ___BPF_FUNC_MAPPER(FN, ctx...) \
FN(unspec, 0, ##ctx) \
@@ -5956,6 +5968,7 @@ union bpf_attr {
FN(user_ringbuf_drain, 209, ##ctx) \
FN(cgrp_storage_get, 210, ##ctx) \
FN(cgrp_storage_delete, 211, ##ctx) \
+ FN(timer_set_sleepable_cb, 212, ##ctx) \
/* */
/* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
--
2.43.0
---
From 6c654010a4660fd26ffce44406dba308ded3b465 Mon Sep 17 00:00:00 2001
From: Benjamin Tissoires <[email protected]>
Date: Tue, 13 Feb 2024 18:40:02 +0100
Subject: [PATCH RFC bpf-next v2 02/11] bpf/verifier: introduce in_sleepable() helper
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <[email protected]>
Signed-off-by: Benjamin Tissoires <[email protected]>
---
kernel/bpf/verifier.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 400c625efe22..8c3707d27c02 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5257,6 +5257,12 @@ static int map_kptr_match_type(struct bpf_verifier_env *env,
return -EINVAL;
}
+static bool in_sleepable(struct bpf_verifier_env *env)
+{
+ return env->prog->aux->sleepable ||
+ (env->cur_state && env->cur_state->in_sleepable);
+}
+
/* The non-sleepable programs and sleepable programs with explicit bpf_rcu_read_lock()
* can dereference RCU protected pointers and result is PTR_TRUSTED.
*/
@@ -5264,7 +5270,7 @@ static bool in_rcu_cs(struct bpf_verifier_env *env)
{
return env->cur_state->active_rcu_lock ||
env->cur_state->active_lock.ptr ||
- !env->prog->aux->sleepable;
+ !in_sleepable(env);
}
/* Once GCC supports btf_type_tag the following mechanism will be replaced with tag check */
@@ -10153,7 +10159,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
return -EINVAL;
}
- if (!env->prog->aux->sleepable && fn->might_sleep) {
+ if (!in_sleepable(env) && fn->might_sleep) {
verbose(env, "helper call might sleep in a non-sleepable prog\n");
return -EINVAL;
}
@@ -10183,7 +10189,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
return -EINVAL;
}
- if (env->prog->aux->sleepable && is_storage_get_function(func_id))
+ if (in_sleepable(env) && is_storage_get_function(func_id))
env->insn_aux_data[insn_idx].storage_get_func_atomic = true;
}
@@ -11544,7 +11550,7 @@ static bool check_css_task_iter_allowlist(struct bpf_verifier_env *env)
return true;
fallthrough;
default:
- return env->prog->aux->sleepable;
+ return in_sleepable(env);
}
}
@@ -12065,7 +12071,7 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
}
sleepable = is_kfunc_sleepable(&meta);
- if (sleepable && !env->prog->aux->sleepable) {
+ if (sleepable && !in_sleepable(env)) {
verbose(env, "program must be sleepable to call sleepable kfunc %s\n", func_name);
return -EACCES;
}
@@ -18208,7 +18214,7 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
return -E2BIG;
}
- if (env->prog->aux->sleepable)
+ if (in_sleepable(env))
atomic64_inc(&map->sleepable_refcnt);
/* hold the map. If the program is rejected by verifier,
* the map will be released by release_maps() or it
@@ -19685,7 +19691,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
}
if (is_storage_get_function(insn->imm)) {
- if (!env->prog->aux->sleepable ||
+ if (!in_sleepable(env) ||
env->insn_aux_data[i + delta].storage_get_func_atomic)
insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_ATOMIC);
else
--
2.43.0
---
Cheers,
Benjamin
On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <[email protected]> wrote:
>
> On Feb 12 2024, Alexei Starovoitov wrote:
> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> > <[email protected]> wrote:
> > >
> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
> > > >
> > > > Benjamin Tissoires <[email protected]> writes:
> > > >
> [...]
> > I agree that workqueue delegation fits into the bpf_timer concept and
> > a lot of code can and should be shared.
>
> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> figure it out entirely.
>
> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> > so we need a new kfunc to set a sleepable callback.
> > Maybe
> > bpf_timer_set_sleepable_cb() ?
>
> OK. So I guess I should drop Toke's suggestion with the bpf_timer_init() flag?
>
> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> > And since prog->aux->sleepable is kinda "global" we need another
> > per subprog flag:
> > bool is_sleepable: 1;
>
> done (in push_callback_call())
>
> >
> > We can factor out a check "if (prog->aux->sleepable)" into a helper
> > that will check that "global" flag and another env->cur_state->in_sleepable
> > flag that will work similar to active_rcu_lock.
>
> done (I think), cf patch 2 below
>
> > Once the verifier starts processing subprog->is_sleepable
> > it will set cur_state->in_sleepable = true;
> > to make all subprogs called from that cb to be recognized as sleepable too.
>
> That's the point I don't know where to put the new code.
>
I think that would go in the already existing special case for
push_async_cb where you get the verifier state of the async callback.
You can make setting the boolean in that verifier state conditional on
whether it's your kfunc/helper you're processing taking a sleepable
callback.
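Roughly something like this, in the push_async_cb() special case (untested
sketch; is_sleepable_cb stands for whatever condition identifies the new
helper/kfunc that takes a sleepable callback):

        async_cb = push_async_cb(env, env->subprog_info[subprog].start,
                                 insn_idx, subprog);
        if (!async_cb)
                return -EFAULT;
        /* proposed flag: everything reached from this callback is then
         * verified as sleepable */
        async_cb->in_sleepable = is_sleepable_cb;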
> It seems the best place would be in do_check(), but I am under the impression
> that the code of the callback is added at the end of the instruction list, meaning
> that I do not know where it starts, and which subprog index it corresponds to.
>
> >
> > A bit of a challenge is what to do with global subprogs,
> > since they're verified lazily. They can be called from
> > sleepable and non-sleepable contexts. Should be solvable.
>
> > I must confess this is way over my head (and given that I didn't even manage to get
> > the "easy" case working, that might explain things a little :-P )
>
I think it will be solvable but made somewhat difficult by the fact
that even if we mark subprog_info of some global_func A as
in_sleepable, so that we explore it as sleepable during its
verification, we might encounter later another global_func that calls
a global func, already explored as non-sleepable, in sleepable
context. In this case I think we need to redo the verification of that
global func as sleepable once again. It could be that it is called
from both non-sleepable and sleepable contexts, so both paths
(in_sleepable = true, and in_sleepable = false) need to be explored,
or we could reject such cases, but it might be a little restrictive.
Some common helper global func unrelated to caller context doing some
auxiliary work, called from sleepable timer callback and normal main
subprog might be an example where rejection will be prohibitive.
An approach might be to explore main and global subprogs once as we do
now, and then keep a list of global subprogs that need to be revisited
as in_sleepable (due to being called from a sleepable context) and
trigger do_check_common for them again, this might have to be repeated
as the list grows on each iteration, but eventually we will have
explored all of them as in_sleepable if need be, and the loop will
end. Surely, this trades off logical simplicity of verifier code with
redoing verification of global subprogs again.
To add items to such a list, for each global subprog we encounter that
needs to be analyzed as in_sleepable, we will also collect all its
callee global subprogs by walking its instructions (a bit like
check_max_stack_depth does).
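In rough pseudo-code (all names here are illustrative, not existing verifier
fields), the revisit loop could look like:

        /* after the normal pass over main and global subprogs */
        while (!list_empty(&env->sleepable_revisit)) {
                struct revisit_item *it = pop_sleepable_revisit(env);

                env->subprog_info[it->subprog].in_sleepable = true;
                err = do_check_common(env, it->subprog);
                if (err)
                        return err;
                /* verifying this subprog may have queued more callee
                 * global subprogs onto the list */
        }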
> >
> > Overall I think this feature is needed urgently,
> > so if you don't have cycles to work on this soon,
> > I can prioritize it right after bpf_arena work.
>
> I can try to spare a few cycles on it. Even though your instructions were spot
> on, I still can't make the subprogs recognized as sleepable.
>
> For reference, this is where I am (probably bogus, but seems to be
> working when timer_set_sleepable_cb() is called from a sleepable context
> as mentioned by Toke):
>
I just skimmed the patch but I think it's already 90% there. The only
other change I would suggest is switching from helper to kfunc, as
originally proposed by Alexei.
> [...]
Kumar Kartikeya Dwivedi <[email protected]> writes:
> On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <[email protected]> wrote:
>>
>> On Feb 12 2024, Alexei Starovoitov wrote:
>> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
>> > <[email protected]> wrote:
>> > >
>> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
>> > > >
>> > > > Benjamin Tissoires <[email protected]> writes:
>> > > >
>> [...]
>> > I agree that workqueue delegation fits into the bpf_timer concept and
>> > a lot of code can and should be shared.
>>
>> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
>> figure it out entirely.
>>
>> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
>> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
>> > so we need a new kfunc to set a sleepable callback.
>> > Maybe
>> > bpf_timer_set_sleepable_cb() ?
>>
>> OK. So I guess I should drop Toke's suggestion with the bpf_timer_init() flag?
>>
>> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
>> > And since prog->aux->sleepable is kinda "global" we need another
>> > per subprog flag:
>> > bool is_sleepable: 1;
>>
>> done (in push_callback_call())
>>
>> >
>> > We can factor out a check "if (prog->aux->sleepable)" into a helper
>> > that will check that "global" flag and another env->cur_state->in_sleepable
>> > flag that will work similar to active_rcu_lock.
>>
>> done (I think), cf patch 2 below
>>
>> > Once the verifier starts processing subprog->is_sleepable
>> > it will set cur_state->in_sleepable = true;
>> > to make all subprogs called from that cb to be recognized as sleepable too.
>>
>> That's the point I don't know where to put the new code.
>>
>
> I think that would go in the already existing special case for
> push_async_cb where you get the verifier state of the async callback.
> You can make setting the boolean in that verifier state conditional on
> whether it's your kfunc/helper you're processing taking a sleepable
> callback.
>
>> It seems the best place would be in do_check(), but I am under the impression
>> that the code of the callback is added at the end of the instruction list, meaning
>> that I do not know where it starts, and which subprog index it corresponds to.
>>
>> >
>> > A bit of a challenge is what to do with global subprogs,
>> > since they're verified lazily. They can be called from
>> > sleepable and non-sleepable contexts. Should be solvable.
>>
>> I must confess this is way over my head (and given that I didn't even manage to get
>> the "easy" case working, that might explain things a little :-P )
>>
>
> I think it will be solvable but made somewhat difficult by the fact
> that even if we mark subprog_info of some global_func A as
> in_sleepable, so that we explore it as sleepable during its
> verification, we might encounter later another global_func that calls
> a global func, already explored as non-sleepable, in sleepable
> context. In this case I think we need to redo the verification of that
> global func as sleepable once again. It could be that it is called
> from both non-sleepable and sleepable contexts, so both paths
> (in_sleepable = true, and in_sleepable = false) need to be explored,
> or we could reject such cases, but it might be a little restrictive.
>
> Some common helper global func unrelated to caller context doing some
> auxiliary work, called from sleepable timer callback and normal main
> subprog might be an example where rejection will be prohibitive.
>
> An approach might be to explore main and global subprogs once as we do
> now, and then keep a list of global subprogs that need to be revisited
> as in_sleepable (due to being called from a sleepable context) and
> trigger do_check_common for them again, this might have to be repeated
> as the list grows on each iteration, but eventually we will have
> explored all of them as in_sleepable if need be, and the loop will
> end. Surely, this trades off logical simplicity of verifier code with
> redoing verification of global subprogs again.
>
> To add items to such a list, for each global subprog we encounter that
> needs to be analyzed as in_sleepable, we will also collect all its
> callee global subprogs by walking its instructions (a bit like
> check_max_stack_depth does).
Sorry if I'm being dense, but why is all this needed if it's already
possible to just define the timer callback from a program type that
allows sleeping, and then set the actual timeout from a different
program that is not sleepable? Isn't the set_sleepable_cb() kfunc just a
convenience then? Or did I misunderstand and it's not actually possible
to mix callback/timer arming from different program types?
-Toke
On Tue, Feb 13, 2024 at 08:23:17PM +0100, Kumar Kartikeya Dwivedi wrote:
> On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <[email protected]> wrote:
> >
> > On Feb 12 2024, Alexei Starovoitov wrote:
> > > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> > > <[email protected]> wrote:
> > > >
> > > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
> > > > >
> > > > > Benjamin Tissoires <[email protected]> writes:
> > > > >
> > [...]
> > > I agree that workqueue delegation fits into the bpf_timer concept and
> > > a lot of code can and should be shared.
> >
> > Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> > figure it out entirely.
> >
> > > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> > > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> > > so we need a new kfunc to set a sleepable callback.
> > > Maybe
> > > bpf_timer_set_sleepable_cb() ?
> >
> > OK. So I guess I should drop Toke's suggestion with the bpf_timer_init() flag?
I think the flag for bpf_timer_init() is still needed.
Since we probably shouldn't combine hrtimer with workqueue.
bpf_timer_start() should do schedule_work() directly.
The latency matters in some cases.
We will use flags to specify which workqueue strategy to use.
Fingers crossed, WQ_BH will be merged in the next merge window and we'd need to expose it
as an option to bpf progs.
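On the BPF program side that could end up looking roughly like this (the
flag name and map layout below are made up, just to show the intent):

        struct elem {
                struct bpf_timer t;
        };

        struct elem *e = bpf_map_lookup_elem(&timers, &key);
        if (e) {
                /* hypothetical flag selecting a workqueue-backed timer, so
                 * that bpf_timer_start() can do schedule_work() directly */
                bpf_timer_init(&e->t, &timers, BPF_F_TIMER_WQ);
                bpf_timer_set_sleepable_cb(&e->t, sleepable_cb);
                bpf_timer_start(&e->t, 0, 0);
        }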
> >
> > > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> > > And since prog->aux->sleepable is kinda "global" we need another
> > > per subprog flag:
> > > bool is_sleepable: 1;
> >
> > done (in push_callback_call())
> >
> > >
> > > We can factor out a check "if (prog->aux->sleepable)" into a helper
> > > that will check that "global" flag and another env->cur_state->in_sleepable
> > > flag that will work similar to active_rcu_lock.
> >
> > done (I think), cf patch 2 below
> >
> > > Once the verifier starts processing subprog->is_sleepable
> > > it will set cur_state->in_sleepable = true;
> > > to make all subprogs called from that cb to be recognized as sleepable too.
> >
> > That's the point I don't know where to put the new code.
> >
>
> I think that would go in the already existing special case for
> push_async_cb where you get the verifier state of the async callback.
> You can make setting the boolean in that verifier state conditional on
> whether it's your kfunc/helper you're processing taking a sleepable
> callback.
>
> > It seems the best place would be in do_check(), but I am under the impression
> > that the code of the callback is added at the end of the instruction list, meaning
> > that I do not know where it starts, and which subprog index it corresponds to.
> >
> > >
> > > A bit of a challenge is what to do with global subprogs,
> > > since they're verified lazily. They can be called from
> > > sleepable and non-sleepable contexts. Should be solvable.
> >
> > I must confess this is way over my head (and given that I didn't even manage to get
> > the "easy" case working, that might explain things a little :-P )
> >
>
> I think it will be solvable but made somewhat difficult by the fact
> that even if we mark subprog_info of some global_func A as
> in_sleepable, so that we explore it as sleepable during its
> verification, we might encounter later another global_func that calls
> a global func, already explored as non-sleepable, in sleepable
> context. In this case I think we need to redo the verification of that
> global func as sleepable once again. It could be that it is called
> from both non-sleepable and sleepable contexts, so both paths
> (in_sleepable = true, and in_sleepable = false) need to be explored,
> or we could reject such cases, but it might be a little restrictive.
>
> Some common helper global func unrelated to caller context doing some
> auxiliary work, called from sleepable timer callback and normal main
> subprog might be an example where rejection will be prohibitive.
>
> An approach might be to explore main and global subprogs once as we do
> now, and then keep a list of global subprogs that need to be revisited
> as in_sleepable (due to being called from a sleepable context) and
> trigger do_check_common for them again, this might have to be repeated
> as the list grows on each iteration, but eventually we will have
> explored all of them as in_sleepable if need be, and the loop will
> end. Surely, this trades off logical simplicity of verifier code with
> redoing verification of global subprogs again.
>
> To add items to such a list, for each global subprog we encounter that
> needs to be analyzed as in_sleepable, we will also collect all its
> callee global subprogs by walking its instructions (a bit like
> check_max_stack_depth does).
imo that would be a serious increase in the verifier complexity.
It's not clear whether this is needed at this point.
For now I would drop cur_state->in_sleepable to false before verifying global subprogs.
> > >
> > > Overall I think this feature is needed urgently,
> > > so if you don't have cycles to work on this soon,
> > > I can prioritize it right after bpf_arena work.
> >
> > I can try to spare a few cycles on it. Even though your instructions were spot
> > on, I still can't make the subprogs recognized as sleepable.
> >
> > For reference, this is where I am (probably bogus, but seems to be
> > working when timer_set_sleepable_cb() is called from a sleepable context
> > as mentioned by Toke):
> >
>
> I just skimmed the patch but I think it's already 90% there. The only
> other change I would suggest is switching from helper to kfunc, as
> originally proposed by Alexei.
+1. I quickly looked through the patches and it does look like 90% there.
On Tue, Feb 13, 2024 at 08:51:26PM +0100, Toke Høiland-Jørgensen wrote:
> Kumar Kartikeya Dwivedi <[email protected]> writes:
>
> > On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <[email protected]> wrote:
> >>
> >> On Feb 12 2024, Alexei Starovoitov wrote:
> >> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> >> > <[email protected]> wrote:
> >> > >
> >> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
> >> > > >
> >> > > > Benjamin Tissoires <[email protected]> writes:
> >> > > >
> >> [...]
> >> > I agree that workqueue delegation fits into the bpf_timer concept and
> >> > a lot of code can and should be shared.
> >>
> >> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> >> figure it out entirely.
> >>
> >> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> >> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> >> > so we need a new kfunc to set a sleepable callback.
> >> > Maybe
> >> > bpf_timer_set_sleepable_cb() ?
> >>
> >> OK. So I guess I should drop Toke's suggestion with the bpf_timer_init() flag?
> >>
> >> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> >> > And since prog->aux->sleepable is kinda "global" we need another
> >> > per subprog flag:
> >> > bool is_sleepable: 1;
> >>
> >> done (in push_callback_call())
> >>
> >> >
> >> > We can factor out a check "if (prog->aux->sleepable)" into a helper
> >> > that will check that "global" flag and another env->cur_state->in_sleepable
> >> > flag that will work similar to active_rcu_lock.
> >>
> >> done (I think), cf patch 2 below
> >>
> >> > Once the verifier starts processing subprog->is_sleepable
> >> > it will set cur_state->in_sleepable = true;
> >> > to make all subprogs called from that cb to be recognized as sleepable too.
> >>
> >> That's the point I don't know where to put the new code.
> >>
> >
> > I think that would go in the already existing special case for
> > push_async_cb where you get the verifier state of the async callback.
> > You can make setting the boolean in that verifier state conditional on
> > whether it's your kfunc/helper you're processing taking a sleepable
> > callback.
> >
> >> It seems the best place would be in do_check(), but I am under the impression
> >> that the code of the callback is added at the end of the instruction list, meaning
> >> that I do not know where it starts, and which subprog index it corresponds to.
> >>
> >> >
> >> > A bit of a challenge is what to do with global subprogs,
> >> > since they're verified lazily. They can be called from
> >> > sleepable and non-sleepable contexts. Should be solvable.
> >>
> >> I must confess this is way over my head (and given that I didn't even manage to get
> >> the "easy" case working, that might explain things a little :-P )
> >>
> >
> > I think it will be solvable but made somewhat difficult by the fact
> > that even if we mark subprog_info of some global_func A as
> > in_sleepable, so that we explore it as sleepable during its
> > verification, we might encounter later another global_func that calls
> > a global func, already explored as non-sleepable, in sleepable
> > context. In this case I think we need to redo the verification of that
> > global func as sleepable once again. It could be that it is called
> > from both non-sleepable and sleepable contexts, so both paths
> > (in_sleepable = true, and in_sleepable = false) need to be explored,
> > or we could reject such cases, but it might be a little restrictive.
> >
> > Some common helper global func unrelated to caller context doing some
> > auxiliary work, called from sleepable timer callback and normal main
> > subprog might be an example where rejection will be prohibitive.
> >
> > An approach might be to explore main and global subprogs once as we do
> > now, and then keep a list of global subprogs that need to be revisited
> > as in_sleepable (due to being called from a sleepable context) and
> > trigger do_check_common for them again, this might have to be repeated
> > as the list grows on each iteration, but eventually we will have
> > explored all of them as in_sleepable if need be, and the loop will
> > end. Surely, this trades off logical simplicity of verifier code with
> > redoing verification of global subprogs again.
> >
> > To add items to such a list, for each global subprog we encounter that
> > needs to be analyzed as in_sleepable, we will also collect all its
> > callee global subprogs by walking its instructions (a bit like
> > check_max_stack_depth does).
>
> Sorry if I'm being dense, but why is all this needed if it's already
> possible to just define the timer callback from a program type that
> allows sleeping, and then set the actual timeout from a different
> program that is not sleepable? Isn't the set_sleepable_cb() kfunc just a
> convenience then? Or did I misunderstand and it's not actually possible
> to mix callback/timer arming from different program types?
More than just convenience.
bpf_set_sleepable_cb() might need to be called from a non-sleepable context,
and there could be no way to hack around it with a fake sleepable entry.
bpf_timer_cancel() clears callback_fn.
So if prog wants to bpf_timer_start() and later bpf_timer_cancel()
it would need to bpf_set_sleepable_cb() every time before bpf_timer_start().
And at that time it might be in non-sleepable ctx.
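I.e. roughly this pattern (sketch only, reusing the proposed kfunc name and
the timer embedded in a map value as "e->t"):

        /* arm once, from a sleepable prog */
        bpf_timer_set_sleepable_cb(&e->t, sleepable_cb);
        bpf_timer_start(&e->t, 1000000, 0);

        /* later, possibly from a different, non-sleepable prog */
        bpf_timer_cancel(&e->t);        /* also clears callback_fn */

        /* re-arming has to set the callback again first, and this may
         * well happen in a non-sleepable context */
        bpf_timer_set_sleepable_cb(&e->t, sleepable_cb);
        bpf_timer_start(&e->t, 1000000, 0);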
Alexei Starovoitov <[email protected]> writes:
> On Tue, Feb 13, 2024 at 08:51:26PM +0100, Toke Høiland-Jørgensen wrote:
>> Kumar Kartikeya Dwivedi <[email protected]> writes:
>>
>> > On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <[email protected]> wrote:
>> >>
>> >> On Feb 12 2024, Alexei Starovoitov wrote:
>> >> > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
>> >> > <[email protected]> wrote:
>> >> > >
>> >> > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
>> >> > > >
>> >> > > > Benjamin Tissoires <[email protected]> writes:
>> >> > > >
>> >> [...]
>> >> > I agree that workqueue delegation fits into the bpf_timer concept and
>> >> > a lot of code can and should be shared.
>> >>
>> >> Thanks Alexei for the detailed answer. I've given it an attempt but still can not
>> >> figure it out entirely.
>> >>
>> >> > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
>> >> > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
>> >> > so we need a new kfunc to set a sleepable callback.
>> >> > Maybe
>> >> > bpf_timer_set_sleepable_cb() ?
>> >>
>> >> OK. So I guess I should drop Toke's suggestion with the bpf_timer_init() flag?
>> >>
>> >> > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
>> >> > And since prog->aux->sleepable is kinda "global" we need another
>> >> > per subprog flag:
>> >> > bool is_sleepable: 1;
>> >>
>> >> done (in push_callback_call())
>> >>
>> >> >
>> >> > We can factor out a check "if (prog->aux->sleepable)" into a helper
>> >> > that will check that "global" flag and another env->cur_state->in_sleepable
>> >> > flag that will work similar to active_rcu_lock.
>> >>
>> >> done (I think), cf patch 2 below
>> >>
>> >> > Once the verifier starts processing subprog->is_sleepable
>> >> > it will set cur_state->in_sleepable = true;
>> >> > to make all subprogs called from that cb to be recognized as sleepable too.
>> >>
>> >> That's the point I don't know where to put the new code.
>> >>
>> >
>> > I think that would go in the already existing special case for
>> > push_async_cb where you get the verifier state of the async callback.
>> > You can make setting the boolean in that verifier state conditional on
>> > whether it's your kfunc/helper you're processing taking a sleepable
>> > callback.
>> >
>> >> It seems the best place would be in do_check(), but I am under the impression
>> >> that the code of the callback is added at the end of the instruction list, meaning
>> >> that I do not know where it starts, and which subprog index it corresponds to.
>> >>
>> >> >
>> >> > A bit of a challenge is what to do with global subprogs,
>> >> > since they're verified lazily. They can be called from
>> >> > sleepable and non-sleepable contexts. Should be solvable.
>> >>
>> >> I must confess this is way over my head (and given that I didn't even manage to get
>> >> the "easy" case working, that might explain things a little :-P )
>> >>
>> >
>> > I think it will be solvable but made somewhat difficult by the fact
>> > that even if we mark subprog_info of some global_func A as
>> > in_sleepable, so that we explore it as sleepable during its
>> > verification, we might encounter later another global_func that calls
>> > a global func, already explored as non-sleepable, in sleepable
>> > context. In this case I think we need to redo the verification of that
>> > global func as sleepable once again. It could be that it is called
>> > from both non-sleepable and sleepable contexts, so both paths
>> > (in_sleepable = true, and in_sleepable = false) need to be explored,
>> > or we could reject such cases, but it might be a little restrictive.
>> >
>> > Some common helper global func unrelated to caller context doing some
>> > auxiliary work, called from sleepable timer callback and normal main
>> > subprog might be an example where rejection will be prohibitive.
>> >
>> > An approach might be to explore main and global subprogs once as we do
>> > now, and then keep a list of global subprogs that need to be revisited
>> > as in_sleepable (due to being called from a sleepable context) and
>> > trigger do_check_common for them again, this might have to be repeated
>> > as the list grows on each iteration, but eventually we will have
>> > explored all of them as in_sleepable if need be, and the loop will
>> > end. Surely, this trades off logical simplicity of verifier code with
>> > redoing verification of global subprogs again.
>> >
>> > To add items to such a list, for each global subprog we encounter that
>> > needs to be analyzed as in_sleepable, we will also collect all its
>> > callee global subprogs by walking its instructions (a bit like
>> > check_max_stack_depth does).
>>
>> Sorry if I'm being dense, but why is all this needed if it's already
>> possible to just define the timer callback from a program type that
>> allows sleeping, and then set the actual timeout from a different
>> program that is not sleepable? Isn't the set_sleepable_cb() kfunc just a
>> convenience then? Or did I misunderstand and it's not actually possible
>> to mix callback/timer arming from different program types?
>
> More than just convenience.
> bpf_set_sleepable_cb() might need to be called from a non-sleepable context,
> and there could be no way to hack around it with a fake sleepable entry.
> bpf_timer_cancel() clears callback_fn.
> So if prog wants to bpf_timer_start() and later bpf_timer_cancel()
> it would need to bpf_set_sleepable_cb() every time before bpf_timer_start().
> And at that time it might be in non-sleepable ctx.
Ah, right, makes sense; didn't think about bpf_timer_cancel(). Thanks
for the explanation :)
-Toke
On Feb 13 2024, Kumar Kartikeya Dwivedi wrote:
> On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires <[email protected]> wrote:
> >
> > On Feb 12 2024, Alexei Starovoitov wrote:
> > > On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires
> > > <[email protected]> wrote:
> > > >
> > > > On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen <[email protected]> wrote:
> > > > >
> > > > > Benjamin Tissoires <[email protected]> writes:
> > > > >
> > [...]
> > > I agree that workqueue delegation fits into the bpf_timer concept and
> > > a lot of code can and should be shared.
> >
> > Thanks Alexei for the detailed answer. I've given it an attempt but still can not
> > figure it out entirely.
> >
> > > All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :)
> > > Too bad, bpf_timer_set_callback() doesn't have a flag argument,
> > > so we need a new kfunc to set a sleepable callback.
> > > Maybe
> > > bpf_timer_set_sleepable_cb() ?
> >
> > OK. So I guess I should drop Toke's suggestion with the bpf_timer_init() flag?
> >
> > > The verifier will set is_async_cb = true for it (like it does for regular cb-s).
> > > And since prog->aux->sleepable is kinda "global" we need another
> > > per subprog flag:
> > > bool is_sleepable: 1;
> >
> > done (in push_callback_call())
> >
> > >
> > > We can factor out a check "if (prog->aux->sleepable)" into a helper
> > > that will check that "global" flag and another env->cur_state->in_sleepable
> > > flag that will work similar to active_rcu_lock.
> >
> > done (I think), cf patch 2 below
> >
> > > Once the verifier starts processing subprog->is_sleepable
> > > it will set cur_state->in_sleepable = true;
> > > to make all subprogs called from that cb to be recognized as sleepable too.
> >
> > That's the point I don't know where to put the new code.
> >
>
> I think that would go in the already existing special case for
> push_async_cb where you get the verifier state of the async callback.
> You can make setting the boolean in that verifier state conditional on
> whether it's your kfunc/helper you're processing taking a sleepable
> callback.
Hehe, thanks a lot. Indeed, it was a simple fix. I tried to put this
everywhere but here, and with your help got it working in 2 mins :)
>
> > It seems the best place would be in do_check(), but I am under the impression
> > that the code of the callback is added at the end of the instruction list, meaning
> > that I do not know where it starts, and which subprog index it corresponds to.
> >
> > >
> > > A bit of a challenge is what to do with global subprogs,
> > > since they're verified lazily. They can be called from
> > > sleepable and non-sleepable contexts. Should be solvable.
> >
> > I must confess this is way over my head (and given that I didn't even manage to get
> > the "easy" case working, that might explain things a little :-P )
> >
>
> I think it will be solvable but made somewhat difficult by the fact
> that even if we mark subprog_info of some global_func A as
> in_sleepable, so that we explore it as sleepable during its
> verification, we might encounter later another global_func that calls
> a global func, already explored as non-sleepable, in sleepable
> context. In this case I think we need to redo the verification of that
> global func as sleepable once again. It could be that it is called
> from both non-sleepable and sleepable contexts, so both paths
> (in_sleepable = true, and in_sleepable = false) need to be explored,
> or we could reject such cases, but it might be a little restrictive.
>
> Some common helper global func unrelated to caller context doing some
> auxiliary work, called from sleepable timer callback and normal main
> subprog might be an example where rejection will be prohibitive.
>
> An approach might be to explore main and global subprogs once as we do
> now, and then keep a list of global subprogs that need to be revisited
> as in_sleepable (due to being called from a sleepable context) and
> trigger do_check_common for them again, this might have to be repeated
> as the list grows on each iteration, but eventually we will have
> explored all of them as in_sleepable if need be, and the loop will
> end. Surely, this trades off logical simplicity of verifier code with
> redoing verification of global subprogs again.
>
> To add items to such a list, for each global subprog we encounter that
> needs to be analyzed as in_sleepable, we will also collect all its
> callee global subprogs by walking its instructions (a bit like
> check_max_stack_depth does).
FWIW, this (or Alexei's suggestion) is still not implemented in v2
>
> > >
> > > Overall I think this feature is needed urgently,
> > > so if you don't have cycles to work on this soon,
> > > I can prioritize it right after bpf_arena work.
> >
> > I can try to spare a few cycles on it. Even though your instructions were spot
> > on, I still can't make the subprogs recognized as sleepable.
> >
> > For reference, this is where I am (probably bogus, but seems to be
> > working when timer_set_sleepable_cb() is called from a sleepable context
> > as mentioned by Toke):
> >
>
> I just skimmed the patch but I think it's already 90% there. The only
> other change I would suggest is switching from helper to kfunc, as
> originally proposed by Alexei.
the kfunc was a rabbit hole:
- I needed to teach the verifier about BPF_TIMER in kfunc
- I needed to teach the verifier about the kfunc itself
- I'm failing at calling the callback :(
Anyway, I'm about to send a second RFC so we can discuss the code and
see where my monkey patching capabilities are reaching their limits.
Cheers,
Benjamin