'struct cpumask' is a bitmap data structure in the kernel whose indices
reflect the CPUs on the system. Commonly, cpumasks are used to track
which CPUs a task is affinitized to, but they can also be used to e.g.
track which cores are associated with a scheduling domain, which cores
on a machine are idle, etc.
It would be useful to be able to query those cpumasks from BPF programs.
For example, when tracing percpu operations, it would be convenient to
have cpumask support if the tracing program wants to track which tasks
end up running on which CPUs in different time intervals, and to check
their cpumask distribution while doing so. Similarly, if we're tracking
NUMA allocations, CPU scheduling domain associations, etc, it would be
useful to be able to concretely compare decisions made by the kernel to
a task's cpumask.
So as to enable such use cases, this patch set proposes a set of kfuncs,
namespaced to bpf_cpumask_*, which allow BPF programs to make queries
against cpumasks, and to allocate and store them as kptrs.
In order to enable these kfuncs, this patch set adds two new
kfunc-related capabilities to the verifier:
1. Defining a mechanism that allows developers to specify which fields
of a struct type should inherit their parent's trust. Specifically,
we specify that the 'const cpumask_t *cpus_ptr' field will be
considered trusted if the parent struct task_struct is trusted.
2. Allowing KF_TRUSTED_ARGS pointers to be walked to see if a BTF type
is equivalent to what a kfunc requires. For example, the patch set
defines the following type:
struct bpf_cpumask {
cpumask_t cpumask;
refcount_t usage;
};
cpumask_t typedefs a struct cpumask, so if a BPF program has a trusted
pointer to a struct bpf_cpumask, it would therefore be safe to pass
that to a kfunc expecting a const struct cpumask *. Note that
3. Updating the verifier to prevent NULL PTR_TO_MEM pointers to be
passed to KF_TRUSTED_ARGS kfuncs. Without this, a kfunc may crash if
it's given a pointer to what it thinks is a scalar struct, but in
reality is an address. For example, a bitmap embedded in a cpumask_t.
Following these BPF verifier changes (and their associated selftest
additions), this patchset adds a set of cpumask kfuncs in
kernel/bpf/cpumask.c, and then tests and documents them.
Lastly, note that some of the kfuncs that were added would benefit from
additional verification logic. For example, any kfunc taking a CPU
argument that exceeds the number of CPUs on the system, etc. For now, we
silently check for and ignore these cases at runtime. When we have e.g.
per-argument kfunc flags, it might be helpful to add another KF_CPU-type
flag that specifies that the verifier should validate that it's a valid
CPU.
David Vernet (8):
bpf: Enable annotating trusted nested pointers
bpf: Allow trusted args to walk struct when checking BTF IDs
bpf: Disallow NULL PTR_TO_MEM for trusted kfuncs
bpf: Enable cpumasks to be queried and used as kptrs
selftests/bpf: Add nested trust selftests suite
selftests/bpf: Add selftest suite for cpumask kfuncs
bpf/docs: Document cpumask kfuncs in a new file
bpf/docs: Document how nested trusted fields may be defined
Documentation/bpf/cpumasks.rst | 353 +++++++++++++
Documentation/bpf/index.rst | 1 +
Documentation/bpf/kfuncs.rst | 26 +-
include/linux/bpf.h | 4 +
kernel/bpf/Makefile | 1 +
kernel/bpf/btf.c | 64 ++-
kernel/bpf/cpumask.c | 476 ++++++++++++++++++
kernel/bpf/verifier.c | 67 ++-
tools/testing/selftests/bpf/DENYLIST.s390x | 2 +
.../selftests/bpf/prog_tests/cpumask.c | 74 +++
.../selftests/bpf/prog_tests/nested_trust.c | 64 +++
.../selftests/bpf/progs/cpumask_common.h | 114 +++++
.../selftests/bpf/progs/cpumask_failure.c | 125 +++++
.../selftests/bpf/progs/cpumask_success.c | 426 ++++++++++++++++
.../selftests/bpf/progs/nested_trust_common.h | 12 +
.../bpf/progs/nested_trust_failure.c | 33 ++
.../bpf/progs/nested_trust_success.c | 29 ++
17 files changed, 1865 insertions(+), 6 deletions(-)
create mode 100644 Documentation/bpf/cpumasks.rst
create mode 100644 kernel/bpf/cpumask.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/cpumask.c
create mode 100644 tools/testing/selftests/bpf/prog_tests/nested_trust.c
create mode 100644 tools/testing/selftests/bpf/progs/cpumask_common.h
create mode 100644 tools/testing/selftests/bpf/progs/cpumask_failure.c
create mode 100644 tools/testing/selftests/bpf/progs/cpumask_success.c
create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_common.h
create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_failure.c
create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_success.c
--
2.39.0
A recent patch added a new set of kfuncs for allocating, freeing,
manipulating, and querying cpumasks. This patch adds a new 'cpumask'
selftest suite which verifies their behavior.
Signed-off-by: David Vernet <[email protected]>
---
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../selftests/bpf/prog_tests/cpumask.c | 74 +++
.../selftests/bpf/progs/cpumask_common.h | 114 +++++
.../selftests/bpf/progs/cpumask_failure.c | 125 +++++
.../selftests/bpf/progs/cpumask_success.c | 426 ++++++++++++++++++
5 files changed, 740 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/cpumask.c
create mode 100644 tools/testing/selftests/bpf/progs/cpumask_common.h
create mode 100644 tools/testing/selftests/bpf/progs/cpumask_failure.c
create mode 100644 tools/testing/selftests/bpf/progs/cpumask_success.c
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index 1cf5b94cda30..4c2c58e9c4e5 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -13,6 +13,7 @@ cgroup_hierarchical_stats # JIT does not support calling kernel f
cgrp_kfunc # JIT does not support calling kernel function
cgrp_local_storage # prog_attach unexpected error: -524 (trampoline)
core_read_macros # unknown func bpf_probe_read#4 (overlapping)
+cpumask # JIT does not support calling kernel function
d_path # failed to auto-attach program 'prog_stat': -524 (trampoline)
decap_sanity # JIT does not support calling kernel function (kfunc)
deny_namespace # failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
diff --git a/tools/testing/selftests/bpf/prog_tests/cpumask.c b/tools/testing/selftests/bpf/prog_tests/cpumask.c
new file mode 100644
index 000000000000..5fbe457c4ebe
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/cpumask.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "cpumask_failure.skel.h"
+#include "cpumask_success.skel.h"
+
+static const char * const cpumask_success_testcases[] = {
+ "test_alloc_free_cpumask",
+ "test_set_clear_cpu",
+ "test_setall_clear_cpu",
+ "test_first_firstzero_cpu",
+ "test_test_and_set_clear",
+ "test_and_or_xor",
+ "test_intersects_subset",
+ "test_copy_any_anyand",
+ "test_insert_leave",
+ "test_insert_remove_release",
+ "test_insert_kptr_get_release",
+};
+
+static void verify_success(const char *prog_name)
+{
+ struct cpumask_success *skel;
+ struct bpf_program *prog;
+ struct bpf_link *link = NULL;
+ pid_t child_pid;
+ int status;
+
+ skel = cpumask_success__open();
+ if (!ASSERT_OK_PTR(skel, "cpumask_success__open"))
+ return;
+
+ skel->bss->pid = getpid();
+ skel->bss->nr_cpus = libbpf_num_possible_cpus();
+
+ cpumask_success__load(skel);
+ if (!ASSERT_OK_PTR(skel, "cpumask_success__load"))
+ goto cleanup;
+
+ prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+ if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+ goto cleanup;
+
+ link = bpf_program__attach(prog);
+ if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
+ goto cleanup;
+
+ child_pid = fork();
+ if (!ASSERT_GT(child_pid, -1, "child_pid"))
+ goto cleanup;
+ if (child_pid == 0)
+ _exit(0);
+ waitpid(child_pid, &status, 0);
+ ASSERT_OK(skel->bss->err, "post_wait_err");
+
+cleanup:
+ bpf_link__destroy(link);
+ cpumask_success__destroy(skel);
+}
+
+void test_cpumask(void)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(cpumask_success_testcases); i++) {
+ if (!test__start_subtest(cpumask_success_testcases[i]))
+ continue;
+
+ verify_success(cpumask_success_testcases[i]);
+ }
+
+ RUN_TESTS(cpumask_failure);
+}
diff --git a/tools/testing/selftests/bpf/progs/cpumask_common.h b/tools/testing/selftests/bpf/progs/cpumask_common.h
new file mode 100644
index 000000000000..ad34f3b602be
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/cpumask_common.h
@@ -0,0 +1,114 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#ifndef _CPUMASK_COMMON_H
+#define _CPUMASK_COMMON_H
+
+#include "errno.h"
+#include <stdbool.h>
+
+int err;
+
+struct __cpumask_map_value {
+ struct bpf_cpumask __kptr_ref * cpumask;
+};
+
+struct array_map {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct __cpumask_map_value);
+ __uint(max_entries, 1);
+} __cpumask_map SEC(".maps");
+
+struct bpf_cpumask *bpf_cpumask_create(void) __ksym;
+void bpf_cpumask_release(struct bpf_cpumask *cpumask) __ksym;
+struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask) __ksym;
+struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumask) __ksym;
+u32 bpf_cpumask_first(const struct cpumask *cpumask) __ksym;
+u32 bpf_cpumask_first_zero(const struct cpumask *cpumask) __ksym;
+void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym;
+void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym;
+bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask) __ksym;
+bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym;
+bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym;
+void bpf_cpumask_setall(struct bpf_cpumask *cpumask) __ksym;
+void bpf_cpumask_clear(struct bpf_cpumask *cpumask) __ksym;
+bool bpf_cpumask_and(struct bpf_cpumask *cpumask,
+ const struct cpumask *src1,
+ const struct cpumask *src2) __ksym;
+void bpf_cpumask_or(struct bpf_cpumask *cpumask,
+ const struct cpumask *src1,
+ const struct cpumask *src2) __ksym;
+void bpf_cpumask_xor(struct bpf_cpumask *cpumask,
+ const struct cpumask *src1,
+ const struct cpumask *src2) __ksym;
+bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2) __ksym;
+bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2) __ksym;
+bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2) __ksym;
+bool bpf_cpumask_empty(const struct cpumask *cpumask) __ksym;
+bool bpf_cpumask_full(const struct cpumask *cpumask) __ksym;
+void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src) __ksym;
+u32 bpf_cpumask_any(const struct cpumask *src) __ksym;
+u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2) __ksym;
+
+static inline const struct cpumask *cast(struct bpf_cpumask *cpumask)
+{
+ return (const struct cpumask *)cpumask;
+}
+
+static inline struct bpf_cpumask *create_cpumask(void)
+{
+ struct bpf_cpumask *cpumask;
+
+ cpumask = bpf_cpumask_create();
+ if (!cpumask) {
+ err = 1;
+ return NULL;
+ }
+
+ if (!bpf_cpumask_empty(cast(cpumask))) {
+ err = 2;
+ bpf_cpumask_release(cpumask);
+ return NULL;
+ }
+
+ return cpumask;
+}
+
+static inline struct __cpumask_map_value *cpumask_map_value_lookup(void)
+{
+ u32 key = 0;
+
+ return bpf_map_lookup_elem(&__cpumask_map, &key);
+}
+
+static inline int cpumask_map_insert(struct bpf_cpumask *mask)
+{
+ struct __cpumask_map_value local, *v;
+ long status;
+ struct bpf_cpumask *old;
+ u32 key = 0;
+
+ local.cpumask = NULL;
+ status = bpf_map_update_elem(&__cpumask_map, &key, &local, 0);
+ if (status) {
+ bpf_cpumask_release(mask);
+ return status;
+ }
+
+ v = bpf_map_lookup_elem(&__cpumask_map, &key);
+ if (!v) {
+ bpf_cpumask_release(mask);
+ return -ENOENT;
+ }
+
+ old = bpf_kptr_xchg(&v->cpumask, mask);
+ if (old) {
+ bpf_cpumask_release(old);
+ return -EEXIST;
+ }
+
+ return 0;
+}
+
+#endif /* _CPUMASK_COMMON_H */
diff --git a/tools/testing/selftests/bpf/progs/cpumask_failure.c b/tools/testing/selftests/bpf/progs/cpumask_failure.c
new file mode 100644
index 000000000000..8a6ac7a91e92
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/cpumask_failure.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#include "cpumask_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+/* Prototype for all of the program trace events below:
+ *
+ * TRACE_EVENT(task_newtask,
+ * TP_PROTO(struct task_struct *p, u64 clone_flags)
+ */
+
+SEC("tp_btf/task_newtask")
+__failure __msg("Unreleased reference")
+int BPF_PROG(test_alloc_no_release, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ cpumask = create_cpumask();
+
+ /* cpumask is never released. */
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("NULL pointer passed to trusted arg0")
+int BPF_PROG(test_alloc_double_release, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ cpumask = create_cpumask();
+
+ /* cpumask is released twice. */
+ bpf_cpumask_release(cpumask);
+ bpf_cpumask_release(cpumask);
+
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("bpf_cpumask_acquire args#0 expected pointer to STRUCT bpf_cpumask")
+int BPF_PROG(test_acquire_wrong_cpumask, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ /* Can't acquire a non-struct bpf_cpumask. */
+ cpumask = bpf_cpumask_acquire((struct bpf_cpumask *)task->cpus_ptr);
+
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("bpf_cpumask_set_cpu args#1 expected pointer to STRUCT bpf_cpumask")
+int BPF_PROG(test_mutate_cpumask, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ /* Can't set the CPU of a non-struct bpf_cpumask. */
+ bpf_cpumask_set_cpu(0, (struct bpf_cpumask *)task->cpus_ptr);
+
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("Unreleased reference")
+int BPF_PROG(test_insert_remove_no_release, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+ struct __cpumask_map_value *v;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ if (cpumask_map_insert(cpumask))
+ return 0;
+
+ v = cpumask_map_value_lookup();
+ if (!v)
+ return 0;
+
+ cpumask = bpf_kptr_xchg(&v->cpumask, NULL);
+
+ /* cpumask is never released. */
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("Unreleased reference")
+int BPF_PROG(test_kptr_get_no_release, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+ struct __cpumask_map_value *v;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ if (cpumask_map_insert(cpumask))
+ return 0;
+
+ v = cpumask_map_value_lookup();
+ if (!v)
+ return 0;
+
+ cpumask = bpf_cpumask_kptr_get(&v->cpumask);
+
+ /* cpumask is never released. */
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("NULL pointer passed to trusted arg0")
+int BPF_PROG(test_cpumask_null, struct task_struct *task, u64 clone_flags)
+{
+ bpf_cpumask_empty(NULL);
+
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/cpumask_success.c b/tools/testing/selftests/bpf/progs/cpumask_success.c
new file mode 100644
index 000000000000..1d38bc65d4b0
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/cpumask_success.c
@@ -0,0 +1,426 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+#include "cpumask_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+int pid, nr_cpus;
+
+static bool is_test_task(void)
+{
+ int cur_pid = bpf_get_current_pid_tgid() >> 32;
+
+ return pid == cur_pid;
+}
+
+static bool create_cpumask_set(struct bpf_cpumask **out1,
+ struct bpf_cpumask **out2,
+ struct bpf_cpumask **out3,
+ struct bpf_cpumask **out4)
+{
+ struct bpf_cpumask *mask1, *mask2, *mask3, *mask4;
+
+ mask1 = create_cpumask();
+ if (!mask1)
+ return false;
+
+ mask2 = create_cpumask();
+ if (!mask2) {
+ bpf_cpumask_release(mask1);
+ err = 3;
+ return false;
+ }
+
+ mask3 = create_cpumask();
+ if (!mask3) {
+ bpf_cpumask_release(mask1);
+ bpf_cpumask_release(mask2);
+ err = 4;
+ return false;
+ }
+
+ mask4 = create_cpumask();
+ if (!mask4) {
+ bpf_cpumask_release(mask1);
+ bpf_cpumask_release(mask2);
+ bpf_cpumask_release(mask3);
+ err = 5;
+ return false;
+ }
+
+ *out1 = mask1;
+ *out2 = mask2;
+ *out3 = mask3;
+ *out4 = mask4;
+
+ return true;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_alloc_free_cpumask, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ if (!is_test_task())
+ return 0;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ bpf_cpumask_release(cpumask);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_set_clear_cpu, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ if (!is_test_task())
+ return 0;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ bpf_cpumask_set_cpu(0, cpumask);
+ if (!bpf_cpumask_test_cpu(0, cast(cpumask))) {
+ err = 3;
+ goto release_exit;
+ }
+
+ bpf_cpumask_clear_cpu(0, cpumask);
+ if (bpf_cpumask_test_cpu(0, cast(cpumask))) {
+ err = 4;
+ goto release_exit;
+ }
+
+release_exit:
+ bpf_cpumask_release(cpumask);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_setall_clear_cpu, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ if (!is_test_task())
+ return 0;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ bpf_cpumask_setall(cpumask);
+ if (!bpf_cpumask_full(cast(cpumask))) {
+ err = 3;
+ goto release_exit;
+ }
+
+ bpf_cpumask_clear(cpumask);
+ if (!bpf_cpumask_empty(cast(cpumask))) {
+ err = 4;
+ goto release_exit;
+ }
+
+release_exit:
+ bpf_cpumask_release(cpumask);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_first_firstzero_cpu, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ if (!is_test_task())
+ return 0;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ if (bpf_cpumask_first(cast(cpumask)) < nr_cpus) {
+ err = 3;
+ goto release_exit;
+ }
+
+ if (bpf_cpumask_first_zero(cast(cpumask)) != 0) {
+ bpf_printk("first zero: %d", bpf_cpumask_first_zero(cast(cpumask)));
+ err = 4;
+ goto release_exit;
+ }
+
+ bpf_cpumask_set_cpu(0, cpumask);
+ if (bpf_cpumask_first(cast(cpumask)) != 0) {
+ err = 5;
+ goto release_exit;
+ }
+
+ if (bpf_cpumask_first_zero(cast(cpumask)) != 1) {
+ err = 6;
+ goto release_exit;
+ }
+
+release_exit:
+ bpf_cpumask_release(cpumask);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_test_and_set_clear, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+
+ if (!is_test_task())
+ return 0;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ if (bpf_cpumask_test_and_set_cpu(0, cpumask)) {
+ err = 3;
+ goto release_exit;
+ }
+
+ if (!bpf_cpumask_test_and_set_cpu(0, cpumask)) {
+ err = 4;
+ goto release_exit;
+ }
+
+ if (!bpf_cpumask_test_and_clear_cpu(0, cpumask)) {
+ err = 5;
+ goto release_exit;
+ }
+
+release_exit:
+ bpf_cpumask_release(cpumask);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_and_or_xor, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *mask1, *mask2, *dst1, *dst2;
+
+ if (!is_test_task())
+ return 0;
+
+ if (!create_cpumask_set(&mask1, &mask2, &dst1, &dst2))
+ return 0;
+
+ bpf_cpumask_set_cpu(0, mask1);
+ bpf_cpumask_set_cpu(1, mask2);
+
+ if (bpf_cpumask_and(dst1, cast(mask1), cast(mask2))) {
+ err = 6;
+ goto release_exit;
+ }
+ if (!bpf_cpumask_empty(cast(dst1))) {
+ err = 7;
+ goto release_exit;
+ }
+
+ bpf_cpumask_or(dst1, cast(mask1), cast(mask2));
+ if (!bpf_cpumask_test_cpu(0, cast(dst1))) {
+ err = 8;
+ goto release_exit;
+ }
+ if (!bpf_cpumask_test_cpu(1, cast(dst1))) {
+ err = 9;
+ goto release_exit;
+ }
+
+ bpf_cpumask_xor(dst2, cast(mask1), cast(mask2));
+ if (!bpf_cpumask_equal(cast(dst1), cast(dst2))) {
+ err = 10;
+ goto release_exit;
+ }
+
+release_exit:
+ bpf_cpumask_release(mask1);
+ bpf_cpumask_release(mask2);
+ bpf_cpumask_release(dst1);
+ bpf_cpumask_release(dst2);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_intersects_subset, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *mask1, *mask2, *dst1, *dst2;
+
+ if (!is_test_task())
+ return 0;
+
+ if (!create_cpumask_set(&mask1, &mask2, &dst1, &dst2))
+ return 0;
+
+ bpf_cpumask_set_cpu(0, mask1);
+ bpf_cpumask_set_cpu(1, mask2);
+ if (bpf_cpumask_intersects(cast(mask1), cast(mask2))) {
+ err = 6;
+ goto release_exit;
+ }
+
+ bpf_cpumask_or(dst1, cast(mask1), cast(mask2));
+ if (!bpf_cpumask_subset(cast(mask1), cast(dst1))) {
+ err = 7;
+ goto release_exit;
+ }
+
+ if (!bpf_cpumask_subset(cast(mask2), cast(dst1))) {
+ err = 8;
+ goto release_exit;
+ }
+
+ if (bpf_cpumask_subset(cast(dst1), cast(mask1))) {
+ err = 9;
+ goto release_exit;
+ }
+
+release_exit:
+ bpf_cpumask_release(mask1);
+ bpf_cpumask_release(mask2);
+ bpf_cpumask_release(dst1);
+ bpf_cpumask_release(dst2);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_copy_any_anyand, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *mask1, *mask2, *dst1, *dst2;
+ u32 cpu;
+
+ if (!is_test_task())
+ return 0;
+
+ if (!create_cpumask_set(&mask1, &mask2, &dst1, &dst2))
+ return 0;
+
+ bpf_cpumask_set_cpu(0, mask1);
+ bpf_cpumask_set_cpu(1, mask2);
+ bpf_cpumask_or(dst1, cast(mask1), cast(mask2));
+
+ cpu = bpf_cpumask_any(cast(mask1));
+ if (cpu != 0) {
+ err = 6;
+ goto release_exit;
+ }
+
+ cpu = bpf_cpumask_any(cast(dst2));
+ if (cpu < nr_cpus) {
+ err = 7;
+ goto release_exit;
+ }
+
+ bpf_cpumask_copy(dst2, cast(dst1));
+ if (!bpf_cpumask_equal(cast(dst1), cast(dst2))) {
+ err = 8;
+ goto release_exit;
+ }
+
+ cpu = bpf_cpumask_any(cast(dst2));
+ if (cpu > 1) {
+ err = 9;
+ goto release_exit;
+ }
+
+ cpu = bpf_cpumask_any_and(cast(mask1), cast(mask2));
+ if (cpu < nr_cpus) {
+ err = 10;
+ goto release_exit;
+ }
+
+release_exit:
+ bpf_cpumask_release(mask1);
+ bpf_cpumask_release(mask2);
+ bpf_cpumask_release(dst1);
+ bpf_cpumask_release(dst2);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_insert_leave, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+ struct __cpumask_map_value *v;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ if (cpumask_map_insert(cpumask))
+ err = 3;
+
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_insert_remove_release, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+ struct __cpumask_map_value *v;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ if (cpumask_map_insert(cpumask)) {
+ err = 3;
+ return 0;
+ }
+
+ v = cpumask_map_value_lookup();
+ if (!v) {
+ err = 4;
+ return 0;
+ }
+
+ cpumask = bpf_kptr_xchg(&v->cpumask, NULL);
+ if (cpumask)
+ bpf_cpumask_release(cpumask);
+ else
+ err = 5;
+
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_insert_kptr_get_release, struct task_struct *task, u64 clone_flags)
+{
+ struct bpf_cpumask *cpumask;
+ struct __cpumask_map_value *v;
+
+ cpumask = create_cpumask();
+ if (!cpumask)
+ return 0;
+
+ if (cpumask_map_insert(cpumask)) {
+ err = 3;
+ return 0;
+ }
+
+ v = cpumask_map_value_lookup();
+ if (!v) {
+ err = 4;
+ return 0;
+ }
+
+ cpumask = bpf_cpumask_kptr_get(&v->cpumask);
+ if (cpumask)
+ bpf_cpumask_release(cpumask);
+ else
+ err = 5;
+
+ return 0;
+}
--
2.39.0
KF_TRUSTED_ARGS kfuncs currently have a subtle and insidious bug in
validating pointers to scalars. Say that you have a kfunc like the
following, which takes an array as the first argument:
bool bpf_cpumask_empty(const struct cpumask *cpumask)
{
return cpumask_empty(cpumask);
}
...
BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
...
If a BPF program were to invoke the kfunc with a NULL argument, it would
crash the kernel. The reason is that struct cpumask is defined as a
bitmap, which is itself defined as an array, and is accessed as a memory
address memory by bitmap operations. So when the verifier analyzes the
register, it interprets it as a pointer to a scalar struct, which is an
array of size 8. check_mem_reg() then sees that the register is NULL,
and returns 0, and the kfunc crashes when it passes it down to the
cpumask wrappers.
To fix this, this patch adds a check for KF_ARG_PTR_TO_MEM which
verifies that the register doesn't contain a NULL pointer if the kfunc
is KF_TRUSTED_ARGS.
This may or may not be desired behavior. Some kfuncs may want to
allow callers to pass NULL-able pointers. An alternative would be adding
a KF_NOT_NULL flag and leaving KF_TRUSTED_ARGS alone, though given that
a kfunc is saying it wants to "trust" an argument, it seems reasonable
to prevent NULL.
Signed-off-by: David Vernet <[email protected]>
---
kernel/bpf/verifier.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9fa101420046..28ccb92ebe65 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9092,6 +9092,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
i, btf_type_str(ref_t), ref_tname, PTR_ERR(resolve_ret));
return -EINVAL;
}
+ if (is_kfunc_trusted_args(meta) && register_is_null(reg)) {
+ verbose(env, "NULL pointer passed to trusted arg%d\n", i);
+ return -EACCES;
+ }
+
ret = check_mem_reg(env, reg, regno, type_size);
if (ret < 0)
return ret;
--
2.39.0
A prior change defined a new BTF_TYPE_SAFE_NESTED macro in the verifier
which allows developers to specify when a pointee field in a struct type
should inherit its parent pointer's trusted status. This patch updates
the kfuncs documentation to specify this macro and how it can be used.
Signed-off-by: David Vernet <[email protected]>
---
Documentation/bpf/kfuncs.rst | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index a74f9e74087b..560f4ede3a9f 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -167,7 +167,8 @@ KF_ACQUIRE and KF_RET_NULL flags.
The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
indicates that the all pointer arguments are valid, and that all pointers to
BTF objects have been passed in their unmodified form (that is, at a zero
-offset, and without having been obtained from walking another pointer).
+offset, and without having been obtained from walking another pointer, with one
+exception described below).
There are two types of pointers to kernel objects which are considered "valid":
@@ -180,6 +181,25 @@ KF_TRUSTED_ARGS kfuncs, and may have a non-zero offset.
The definition of "valid" pointers is subject to change at any time, and has
absolutely no ABI stability guarantees.
+As mentioned above, a nested pointer obtained from walking a trusted pointer is
+no longer trusted, with one exception. If a struct type has a field that is
+guaranteed to be valid as long as its parent pointer is trusted, the
+``BTF_TYPE_SAFE_NESTED`` macro can be used to express that to the verifier as
+follows:
+
+.. code-block:: c
+
+ BTF_TYPE_SAFE_NESTED(struct task_struct) {
+ const cpumask_t *cpus_ptr;
+ };
+
+In other words, you must:
+
+1. Wrap the trusted pointer type in the ``BTF_TYPE_SAFE_NESTED`` macro.
+
+2. Specify the type and name of the trusted nested field. This field must match
+ the field in the original type definition exactly.
+
2.4.6 KF_SLEEPABLE flag
-----------------------
--
2.39.0
When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
currently enforces that the top-level type must match when calling
the kfunc. In other words, the verifier does not allow the BPF program
to pass a bitwise equivalent struct, despite it being functionally safe.
For example, if you have the following type:
struct nf_conn___init {
struct nf_conn ct;
};
It would be safe to pass a struct nf_conn___init to a kfunc expecting a
struct nf_conn. Being able to do this will be useful for certain types
of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
series of kfuncs will be added which allow programs to do bitwise
queries on cpumasks that are either allocated by the program (in which
case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
its first element), or a cpumask that was allocated by the main kernel
(in which case it will just be a straight cpumask_t, as in
task->cpus_ptr).
Having the two types of cpumasks allows us to distinguish between the
two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
cannot be. On the other hand, a struct bpf_cpumask can of course be
queried in the exact same manner as a cpumask_t, with e.g.
bpf_cpumask_test_cpu().
If we were to enforce that top level types match, then a user that's
passing a struct bpf_cpumask to a read-only cpumask_t argument would
have to cast with something like bpf_cast_to_kern_ctx() (which itself
would need to be updated to expect the alias, and currently it only
accommodates a single alias per prog type). Additionally, not specifying
KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
struct bpf_cpumask *, and another as a struct cpumask *
(i.e. cpumask_t).
In order to enable this, this patch relaxes the constraint that a
KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
try and be conservative and match existing behavior / expectations, this
patch also enforces strict type checking for acquire kfuncs. We were
already enforcing it for release kfuncs, so this should also improve the
consistency of the semantics for kfuncs.
Signed-off-by: David Vernet <[email protected]>
---
kernel/bpf/verifier.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7f973847b58e..9fa101420046 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8563,9 +8563,34 @@ static int process_kf_arg_ptr_to_btf_id(struct bpf_verifier_env *env,
reg_ref_id = *reg2btf_ids[base_type(reg->type)];
}
- if (is_kfunc_trusted_args(meta) || (is_kfunc_release(meta) && reg->ref_obj_id))
+ /* Enforce strict type matching for calls to kfuncs that are acquiring
+ * or releasing a reference. We do _not_ enforce strict matching for
+ * plain KF_TRUSTED_ARGS kfuncs, as we want to enable BPF programs to
+ * pass types that are bitwise equivalent without forcing them to
+ * explicitly cast with something like bpf_cast_to_kern_ctx().
+ *
+ * For example, say we had a type like the following:
+ *
+ * struct bpf_cpumask {
+ * cpumask_t cpumask;
+ * refcount_t usage;
+ * };
+ *
+ * Note that as specified in <linux/cpumask.h>, cpumask_t is typedef'ed
+ * to a struct cpumask, so it would be safe to pass a struct
+ * bpf_cpumask * to a kfunc expecting a struct cpumask *.
+ *
+ * The philosophy here is similar to how we allow scalars of different
+ * types to be passed to kfuncs as long as the size is the same. The
+ * only difference here is that we're simply allowing
+ * btf_struct_ids_match() to walk the struct at the 0th offset, and
+ * resolve types.
+ */
+ if (is_kfunc_acquire(meta) || (is_kfunc_release(meta) && reg->ref_obj_id))
strict_type_match = true;
+ WARN_ON_ONCE(is_kfunc_trusted_args(meta) && reg->off);
+
reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id, ®_ref_id);
reg_ref_tname = btf_name_by_offset(reg_btf, reg_ref_t->name_off);
if (!btf_struct_ids_match(&env->log, reg_btf, reg_ref_id, reg->off, meta->btf, ref_id, strict_type_match)) {
--
2.39.0
Now that defining trusted fields in a struct is supported, we should add
selftests to verify the behavior. This patch adds a few such testcases.
Signed-off-by: David Vernet <[email protected]>
---
tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
.../selftests/bpf/prog_tests/nested_trust.c | 64 +++++++++++++++++++
.../selftests/bpf/progs/nested_trust_common.h | 12 ++++
.../bpf/progs/nested_trust_failure.c | 33 ++++++++++
.../bpf/progs/nested_trust_success.c | 29 +++++++++
5 files changed, 139 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/nested_trust.c
create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_common.h
create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_failure.c
create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_success.c
diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index 96e8371f5c2a..1cf5b94cda30 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -44,6 +44,7 @@ map_kptr # failed to open_and_load program: -524
modify_return # modify_return attach failed: -524 (trampoline)
module_attach # skel_attach skeleton attach failed: -524 (trampoline)
mptcp
+nested_trust # JIT does not support calling kernel function
netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?)
probe_user # check_kprobe_res wrong kprobe res from probe read (?)
rcu_read_lock # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?)
diff --git a/tools/testing/selftests/bpf/prog_tests/nested_trust.c b/tools/testing/selftests/bpf/prog_tests/nested_trust.c
new file mode 100644
index 000000000000..4d13612f5001
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/nested_trust.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "nested_trust_failure.skel.h"
+#include "nested_trust_success.skel.h"
+
+static const char * const nested_trust_success_testcases[] = {
+ "test_read_cpumask",
+};
+
+static void verify_success(const char *prog_name)
+{
+ struct nested_trust_success *skel;
+ struct bpf_program *prog;
+ struct bpf_link *link = NULL;
+ int status;
+ pid_t child_pid;
+
+ skel = nested_trust_success__open();
+ if (!ASSERT_OK_PTR(skel, "nested_trust_success__open"))
+ return;
+
+ skel->bss->pid = getpid();
+
+ nested_trust_success__load(skel);
+ if (!ASSERT_OK_PTR(skel, "nested_trust_success__load"))
+ goto cleanup;
+
+ prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+ if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+ goto cleanup;
+
+ link = bpf_program__attach(prog);
+ if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
+ goto cleanup;
+
+ child_pid = fork();
+ if (!ASSERT_GT(child_pid, -1, "child_pid"))
+ goto cleanup;
+ if (child_pid == 0)
+ _exit(0);
+ waitpid(child_pid, &status, 0);
+ ASSERT_OK(skel->bss->err, "post_wait_err");
+
+ bpf_link__destroy(link);
+
+cleanup:
+ nested_trust_success__destroy(skel);
+}
+
+void test_nested_trust(void)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(nested_trust_success_testcases); i++) {
+ if (!test__start_subtest(nested_trust_success_testcases[i]))
+ continue;
+
+ verify_success(nested_trust_success_testcases[i]);
+ }
+
+ RUN_TESTS(nested_trust_failure);
+}
diff --git a/tools/testing/selftests/bpf/progs/nested_trust_common.h b/tools/testing/selftests/bpf/progs/nested_trust_common.h
new file mode 100644
index 000000000000..83d33931136e
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/nested_trust_common.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#ifndef _NESTED_TRUST_COMMON_H
+#define _NESTED_TRUST_COMMON_H
+
+#include <stdbool.h>
+
+bool bpf_cpumask_test_cpu(unsigned int cpu, const struct cpumask *cpumask) __ksym;
+bool bpf_cpumask_first_zero(const struct cpumask *cpumask) __ksym;
+
+#endif /* _NESTED_TRUST_COMMON_H */
diff --git a/tools/testing/selftests/bpf/progs/nested_trust_failure.c b/tools/testing/selftests/bpf/progs/nested_trust_failure.c
new file mode 100644
index 000000000000..14aff7676436
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/nested_trust_failure.c
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#include "nested_trust_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+/* Prototype for all of the program trace events below:
+ *
+ * TRACE_EVENT(task_newtask,
+ * TP_PROTO(struct task_struct *p, u64 clone_flags)
+ */
+
+SEC("tp_btf/task_newtask")
+__failure __msg("R2 must be referenced or trusted")
+int BPF_PROG(test_invalid_nested_user_cpus, struct task_struct *task, u64 clone_flags)
+{
+ bpf_cpumask_test_cpu(0, task->user_cpus_ptr);
+ return 0;
+}
+
+SEC("tp_btf/task_newtask")
+__failure __msg("R1 must have zero offset when passed to release func or trusted arg to kfunc")
+int BPF_PROG(test_invalid_nested_offset, struct task_struct *task, u64 clone_flags)
+{
+ bpf_cpumask_first_zero(&task->cpus_mask);
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/nested_trust_success.c b/tools/testing/selftests/bpf/progs/nested_trust_success.c
new file mode 100644
index 000000000000..04079f120bea
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/nested_trust_success.c
@@ -0,0 +1,29 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+#include "nested_trust_common.h"
+
+char _license[] SEC("license") = "GPL";
+
+int pid, err;
+
+static bool is_test_task(void)
+{
+ int cur_pid = bpf_get_current_pid_tgid() >> 32;
+
+ return pid == cur_pid;
+}
+
+SEC("tp_btf/task_newtask")
+int BPF_PROG(test_read_cpumask, struct task_struct *task, u64 clone_flags)
+{
+ if (!is_test_task())
+ return 0;
+
+ bpf_cpumask_test_cpu(0, task->cpus_ptr);
+ return 0;
+}
--
2.39.0
In kfuncs, a "trusted" pointer is a pointer that the kfunc can assume is
safe, and which the verifier will allow to be passed to a
KF_TRUSTED_ARGS kfunc. Currently, a KF_TRUSTED_ARGS kfunc disallows any
pointer to be passed at a nonzero offset, but sometimes this is in fact
safe if the "nested" pointer's lifetime is inherited from its parent.
For example, the const cpumask_t *cpus_ptr field in a struct task_struct
will remain valid until the task itself is destroyed, and thus would
also be safe to pass to a KF_TRUSTED_ARGS kfunc.
While it would be conceptually simple to enable this by using BTF tags,
gcc unfortunately does not yet support this. In the interim, this patch
enables support for this by using a type-naming convention. A new
BTF_TYPE_SAFE_NESTED macro is defined in verifier.c which allows a
developer to specify the nested fields of a type which are considered
trusted if its parent is also trusted. The verifier is also updated to
account for this. A patch with selftests will be added in a follow-on
change, along with documentation for this feature.
Signed-off-by: David Vernet <[email protected]>
---
include/linux/bpf.h | 4 +++
kernel/bpf/btf.c | 64 ++++++++++++++++++++++++++++++++++++++++++-
kernel/bpf/verifier.c | 32 ++++++++++++++++++++--
3 files changed, 96 insertions(+), 4 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ae7771c7d750..283e96e5b228 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2186,6 +2186,10 @@ struct bpf_core_ctx {
const struct btf *btf;
};
+bool btf_nested_type_is_trusted(struct bpf_verifier_log *log,
+ const struct bpf_reg_state *reg,
+ int off);
+
int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
int relo_idx, void *insn);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 4ba749fcce9d..fcd66edc22af 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -24,6 +24,7 @@
#include <linux/perf_event.h>
#include <linux/bsearch.h>
#include <linux/kobject.h>
+#include <linux/stringify.h>
#include <linux/sysfs.h>
#include <net/sock.h>
#include "../tools/lib/bpf/relo_core.h"
@@ -529,7 +530,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
return -ENOENT;
}
-static s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
+s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
{
struct btf *btf;
s32 ret;
@@ -8227,3 +8228,64 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
}
return err;
}
+
+bool btf_nested_type_is_trusted(struct bpf_verifier_log *log,
+ const struct bpf_reg_state *reg,
+ int off)
+{
+ struct btf *btf = reg->btf;
+ const struct btf_type *walk_type, *safe_type;
+ const char *tname;
+ char safe_tname[64];
+ long ret, safe_id;
+ const struct btf_member *member, *m_walk = NULL;
+ u32 i;
+ const char *walk_name;
+
+ walk_type = btf_type_by_id(btf, reg->btf_id);
+ if (!walk_type)
+ return false;
+
+ tname = btf_name_by_offset(btf, walk_type->name_off);
+
+ ret = snprintf(safe_tname, sizeof(safe_tname), "%s__safe_fields", tname);
+ if (ret < 0)
+ return false;
+
+ safe_id = btf_find_by_name_kind(btf, safe_tname, BTF_INFO_KIND(walk_type->info));
+ if (safe_id < 0)
+ return false;
+
+ safe_type = btf_type_by_id(btf, safe_id);
+ if (!safe_type)
+ return false;
+
+ for_each_member(i, walk_type, member) {
+ u32 moff;
+
+ /* We're looking for the PTR_TO_BTF_ID member in the struct
+ * type we're walking which matches the specified offset.
+ * Below, we'll iterate over the fields in the safe variant of
+ * the struct and see if any of them has a matching type /
+ * name.
+ */
+ moff = __btf_member_bit_offset(walk_type, member) / 8;
+ if (off == moff) {
+ m_walk = member;
+ break;
+ }
+ }
+ if (m_walk == NULL)
+ return false;
+
+ walk_name = __btf_name_by_offset(btf, m_walk->name_off);
+ for_each_member(i, safe_type, member) {
+ const char *m_name = __btf_name_by_offset(btf, member->name_off);
+
+ /* If we match on both type and name, the field is considered trusted. */
+ if (m_walk->type == member->type && !strcmp(walk_name, m_name))
+ return true;
+ }
+
+ return false;
+}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ca7db2ce70b9..7f973847b58e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4755,6 +4755,25 @@ static int bpf_map_direct_read(struct bpf_map *map, int off, int size, u64 *val)
return 0;
}
+#define BTF_TYPE_SAFE_NESTED(__type) __PASTE(__type, __safe_fields)
+
+BTF_TYPE_SAFE_NESTED(struct task_struct) {
+ const cpumask_t *cpus_ptr;
+};
+
+static bool nested_ptr_is_trusted(struct bpf_verifier_env *env,
+ struct bpf_reg_state *reg,
+ int off)
+{
+ /* If its parent is not trusted, it can't regain its trusted status. */
+ if (!is_trusted_reg(reg))
+ return false;
+
+ BTF_TYPE_EMIT(BTF_TYPE_SAFE_NESTED(struct task_struct));
+
+ return btf_nested_type_is_trusted(&env->log, reg, off);
+}
+
static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
struct bpf_reg_state *regs,
int regno, int off, int size,
@@ -4843,10 +4862,17 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
if (type_flag(reg->type) & PTR_UNTRUSTED)
flag |= PTR_UNTRUSTED;
- /* By default any pointer obtained from walking a trusted pointer is
- * no longer trusted except the rcu case below.
+ /* By default any pointer obtained from walking a trusted pointer is no
+ * longer trusted, unless the field being accessed has explicitly been
+ * marked as inheriting its parent's state of trust.
+ *
+ * An RCU-protected pointer can also be deemed trusted if we are in an
+ * RCU read region. This case is handled below.
*/
- flag &= ~PTR_TRUSTED;
+ if (nested_ptr_is_trusted(env, reg, off))
+ flag |= PTR_TRUSTED;
+ else
+ flag &= ~PTR_TRUSTED;
if (flag & MEM_RCU) {
/* Mark value register as MEM_RCU only if it is protected by
--
2.39.0
Certain programs may wish to be able to query cpumasks. For example, if
a program that is tracing percpu operations may wish to track which
tasks end up running on which CPUs, and it could be useful to associate
that with the tasks' cpumasks. Similarly, a program tracking NUMA
allocations, CPU scheduling domains, etc, would potentially benefit from
being able to see which CPUs a task could be migrated to, etc.
This patch enables these such cases by introducing a series of
bpf_cpumask_* kfuncs. Amongst these kfuncs, there are two separate
"classes" of operations:
1. kfuncs which allow the caller to allocate and mutate their own
cpumasks in the form of a struct bpf_cpumask * object. Such kfuncs
include e.g. bpf_cpumask_create() to allocate the cpumask, and
bpf_cpumask_or() to mutate it. "Regular" cpumasks such as p->cpus_ptr
may not be passed to these kfuncs, and the verifier will ensure this
is the case by comparing BTF IDs.
2. Read-only operations which operate on const struct cpumask *
arguments. For example, bpf_cpumask_test_cpu(), which tests whether a
CPU is set in the cpumask. Any trusted struct cpumask * or struct
bpf_cpumask * may be passed to these kfuncs. The verifier allows
struct bpf_cpumask * even though the kfunc is defined with struct
cpumask * because the first element of a struct bpf_cpumask is a
cpumask_t, so it is safe to cast.
A follow-on patch will add selftests which validate these kfuncs, and
another will document them.
Note that some of the kfuncs that were added would benefit from
additional verification logic. For example, any kfunc taking a CPU
argument that exceeds the number of CPUs on the system, etc. For now, we
silently check for and ignore these cases at runtime. When we have e.g.
per-argument kfunc flags, it might be helpful to add another KF_CPU-type
flag that specifies that the verifier should validate that it's a valid
CPU.
Signed-off-by: David Vernet <[email protected]>
---
kernel/bpf/Makefile | 1 +
kernel/bpf/cpumask.c | 263 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 264 insertions(+)
create mode 100644 kernel/bpf/cpumask.c
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 3a12e6b400a2..02242614dcc7 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -36,6 +36,7 @@ obj-$(CONFIG_DEBUG_INFO_BTF) += sysfs_btf.o
endif
ifeq ($(CONFIG_BPF_JIT),y)
obj-$(CONFIG_BPF_SYSCALL) += bpf_struct_ops.o
+obj-$(CONFIG_BPF_SYSCALL) += cpumask.o
obj-${CONFIG_BPF_LSM} += bpf_lsm.o
endif
obj-$(CONFIG_BPF_PRELOAD) += preload/
diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
new file mode 100644
index 000000000000..e1fa15a7e079
--- /dev/null
+++ b/kernel/bpf/cpumask.c
@@ -0,0 +1,263 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2023 Meta, Inc
+ */
+#include <linux/bpf.h>
+#include <linux/bpf_mem_alloc.h>
+#include <linux/btf.h>
+#include <linux/btf_ids.h>
+#include <linux/cpumask.h>
+
+/**
+ * struct bpf_cpumask - refcounted BPF cpumask wrapper structure
+ * @cpumask: The actual cpumask embedded in the struct.
+ * @usage: Object reference counter. When the refcount goes to 0, the
+ * memory is released back to the BPF allocator, which provides
+ * RCU safety.
+ *
+ * Note that we explicitly embed a cpumask_t rather than a cpumask_var_t. This
+ * is done to avoid confusing the verifier due to the typedef of cpumask_var_t
+ * changing depending on whether CONFIG_CPUMASK_OFFSTACK is defined or not. See
+ * the details in <linux/cpumask.h>. The consequence is that this structure is
+ * likely a bit larger than it needs to be when CONFIG_CPUMASK_OFFSTACK is
+ * defined due to embedding the whole NR_CPUS-size bitmap, but the extra memory
+ * overhead it's minimal. For the more typical case of CONFIG_CPUMASK_OFFSTACK
+ * not being defined, the structure is the same size regardless.
+ */
+struct bpf_cpumask {
+ cpumask_t cpumask;
+ refcount_t usage;
+};
+
+static struct bpf_mem_alloc bpf_cpumask_ma;
+
+static bool cpu_valid(u32 cpu)
+{
+ return cpu < nr_cpu_ids;
+}
+
+struct bpf_cpumask *bpf_cpumask_create(void)
+{
+ struct bpf_cpumask *cpumask;
+
+ cpumask = bpf_mem_alloc(&bpf_cpumask_ma, sizeof(*cpumask));
+ if (!cpumask)
+ return NULL;
+
+ memset(cpumask, 0, sizeof(*cpumask));
+ refcount_set(&cpumask->usage, 1);
+
+ return cpumask;
+}
+
+struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
+{
+ refcount_inc(&cpumask->usage);
+ return cpumask;
+}
+
+struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
+{
+ struct bpf_cpumask *cpumask;
+
+ /* The BPF memory allocator frees memory backing its caches in an RCU
+ * callback. Thus, we can safely use RCU to ensure that the cpumask is
+ * safe to read.
+ */
+ rcu_read_lock();
+
+ cpumask = READ_ONCE(*cpumaskp);
+ if (cpumask && !refcount_inc_not_zero(&cpumask->usage))
+ cpumask = NULL;
+
+ rcu_read_unlock();
+ return cpumask;
+}
+
+void bpf_cpumask_release(struct bpf_cpumask *cpumask)
+{
+ if (!cpumask)
+ return;
+
+ if (refcount_dec_and_test(&cpumask->usage)) {
+ migrate_disable();
+ bpf_mem_free(&bpf_cpumask_ma, cpumask);
+ migrate_enable();
+ }
+}
+
+u32 bpf_cpumask_first(const struct cpumask *cpumask)
+{
+ return cpumask_first(cpumask);
+}
+
+u32 bpf_cpumask_first_zero(const struct cpumask *cpumask)
+{
+ return cpumask_first_zero(cpumask);
+}
+
+void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
+{
+ if (!cpu_valid(cpu))
+ return;
+
+ cpumask_set_cpu(cpu, (struct cpumask *)cpumask);
+}
+
+void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
+{
+ if (!cpu_valid(cpu))
+ return;
+
+ cpumask_clear_cpu(cpu, (struct cpumask *)cpumask);
+}
+
+bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
+{
+ if (!cpu_valid(cpu))
+ return false;
+
+ return cpumask_test_cpu(cpu, (struct cpumask *)cpumask);
+}
+
+bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
+{
+ if (!cpu_valid(cpu))
+ return false;
+
+ return cpumask_test_and_set_cpu(cpu, (struct cpumask *)cpumask);
+}
+
+bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
+{
+ if (!cpu_valid(cpu))
+ return false;
+
+ return cpumask_test_and_clear_cpu(cpu, (struct cpumask *)cpumask);
+}
+
+void bpf_cpumask_setall(struct bpf_cpumask *cpumask)
+{
+ cpumask_setall((struct cpumask *)cpumask);
+}
+
+void bpf_cpumask_clear(struct bpf_cpumask *cpumask)
+{
+ cpumask_clear((struct cpumask *)cpumask);
+}
+
+bool bpf_cpumask_and(struct bpf_cpumask *dst,
+ const struct cpumask *src1,
+ const struct cpumask *src2)
+{
+ return cpumask_and((struct cpumask *)dst, src1, src2);
+}
+
+void bpf_cpumask_or(struct bpf_cpumask *dst,
+ const struct cpumask *src1,
+ const struct cpumask *src2)
+{
+ cpumask_or((struct cpumask *)dst, src1, src2);
+}
+
+void bpf_cpumask_xor(struct bpf_cpumask *dst,
+ const struct cpumask *src1,
+ const struct cpumask *src2)
+{
+ cpumask_xor((struct cpumask *)dst, src1, src2);
+}
+
+bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2)
+{
+ return cpumask_equal(src1, src2);
+}
+
+bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2)
+{
+ return cpumask_intersects(src1, src2);
+}
+
+bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2)
+{
+ return cpumask_subset(src1, src2);
+}
+
+bool bpf_cpumask_empty(const struct cpumask *cpumask)
+{
+ return cpumask_empty(cpumask);
+}
+
+bool bpf_cpumask_full(const struct cpumask *cpumask)
+{
+ return cpumask_full(cpumask);
+}
+
+void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src)
+{
+ cpumask_copy((struct cpumask *)dst, src);
+}
+
+u32 bpf_cpumask_any(const struct cpumask *cpumask)
+{
+ return cpumask_any(cpumask);
+}
+
+u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2)
+{
+ return cpumask_any_and(src1, src2);
+}
+
+BTF_SET8_START(cpumask_kfunc_btf_ids)
+BTF_ID_FLAGS(func, bpf_cpumask_create, KF_ACQUIRE | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_cpumask_release, KF_RELEASE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_acquire, KF_ACQUIRE | KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_kptr_get, KF_ACQUIRE | KF_KPTR_GET | KF_RET_NULL)
+BTF_ID_FLAGS(func, bpf_cpumask_first, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_first_zero, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_set_cpu, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_clear_cpu, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_test_cpu, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_test_and_set_cpu, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_test_and_clear_cpu, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_setall, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_clear, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_and, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_or, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_xor, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_equal, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_intersects, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_subset, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_full, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_copy, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_any, KF_TRUSTED_ARGS)
+BTF_ID_FLAGS(func, bpf_cpumask_any_and, KF_TRUSTED_ARGS)
+BTF_SET8_END(cpumask_kfunc_btf_ids)
+
+static const struct btf_kfunc_id_set cpumask_kfunc_set = {
+ .owner = THIS_MODULE,
+ .set = &cpumask_kfunc_btf_ids,
+};
+
+BTF_ID_LIST(cpumask_dtor_ids)
+BTF_ID(struct, bpf_cpumask)
+BTF_ID(func, bpf_cpumask_release)
+
+static int __init cpumask_kfunc_init(void)
+{
+ int ret;
+ const struct btf_id_dtor_kfunc cpumask_dtors[] = {
+ {
+ .btf_id = cpumask_dtor_ids[0],
+ .kfunc_btf_id = cpumask_dtor_ids[1]
+ },
+ };
+
+ ret = bpf_mem_alloc_init(&bpf_cpumask_ma, 0, false);
+ ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &cpumask_kfunc_set);
+ ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &cpumask_kfunc_set);
+ return ret ?: register_btf_id_dtor_kfuncs(cpumask_dtors,
+ ARRAY_SIZE(cpumask_dtors),
+ THIS_MODULE);
+}
+
+late_initcall(cpumask_kfunc_init);
--
2.39.0
Now that we've added a series of new cpumask kfuncs, we should document
them so users can easily use them. This patch adds a new cpumasks.rst
file to document them.
Signed-off-by: David Vernet <[email protected]>
---
Documentation/bpf/cpumasks.rst | 357 +++++++++++++++++++++++++++++++++
Documentation/bpf/index.rst | 1 +
Documentation/bpf/kfuncs.rst | 11 +
kernel/bpf/cpumask.c | 208 +++++++++++++++++++
4 files changed, 577 insertions(+)
create mode 100644 Documentation/bpf/cpumasks.rst
diff --git a/Documentation/bpf/cpumasks.rst b/Documentation/bpf/cpumasks.rst
new file mode 100644
index 000000000000..ae6238965c50
--- /dev/null
+++ b/Documentation/bpf/cpumasks.rst
@@ -0,0 +1,357 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _cpumasks-header-label:
+
+==================
+BPF cpumask kfuncs
+==================
+
+1. Introduction
+===============
+
+``struct cpumask`` is a bitmap data structure in the kernel whose indices
+reflect the CPUs on the system. Commonly, cpumasks are used to track which CPUs
+a task is affinitized to, but they can also be used to e.g. track which cores
+are associated with a scheduling domain, which cores on a machine are idle,
+etc.
+
+BPF provides programs with a set of :ref:`kfuncs-header-label` that can be
+used to allocate, mutate, query, and free cpumasks.
+
+2. BPF cpumask objects
+======================
+
+There are two different types of cpumasks that can be used by BPF programs.
+
+2.1 ``struct bpf_cpumask *``
+----------------------------
+
+``struct bpf_cpumask *`` is a cpumask that is allocated by BPF, on behalf of a
+BPF program, and whose lifecycle is entirely controlled by BPF. These cpumasks
+are RCU-protected, can be mutated, can be used as kptrs, and can be safely cast
+to a ``struct cpumask *``.
+
+2.1.1 ``struct bpf_cpumask *`` lifecycle
+----------------------------------------
+
+A ``struct bpf_cpumask *`` is allocated, acquired, and released, using the
+following functions:
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_create
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_acquire
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_release
+
+For example:
+
+.. code-block:: c
+
+ /**
+ * A trivial example tracepoint program that shows how to
+ * acquire and release a struct bpf_cpumask *.
+ */
+ SEC("tp_btf/task_newtask")
+ int BPF_PROG(task_acquire_release_example, struct task_struct *task, u64 clone_flags)
+ {
+ struct bpf_cpumask *cpumask, *acquired;
+
+ cpumask = bpf_cpumask_create();
+ if (!cpumask)
+ return 1;
+
+ acquired = bpf_cpumask_acquire(cpumask);
+ bpf_cpumask_release(cpumask);
+ bpf_cpumask_acquire(acquired);
+
+ return 0;
+ }
+
+----
+
+2.1.1 ``struct bpf_cpumask *`` as kptrs
+---------------------------------------
+
+As mentioned above, these ``struct bpf_cpumask *`` objects can also be stored
+in a map and used as kptrs. If a ``struct bpf_cpumask *`` is in a map, the
+reference can be removed from the map with bpf_kptr_xchg(), or
+opportunistically acquired with bpf_cpumask_kptr_get():
+
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_kptr_get
+
+Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map:
+
+.. code-block:: c
+
+ /* struct containing the struct bpf_cpumask kptr which is actually stored in the map. */
+ struct __bpf_cpumasks_kfunc_map_value {
+ struct bpf_cpumask __kptr_ref * bpf_cpumask;
+ };
+
+ /* The map containing struct __bpf_cpumasks_kfunc_map_value entries. */
+ struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __type(key, int);
+ __type(value, struct __bpf_cpumasks_kfunc_map_value);
+ __uint(max_entries, 1);
+ } __bpf_cpumasks_kfunc_map SEC(".maps");
+
+ /* ... */
+
+ /**
+ * A simple example tracepoint program showing how a
+ * struct bpf_cpumask * kptr that is stored in a map can
+ * be acquired using the bpf_cpumask_kptr_get() kfunc.
+ */
+ SEC("tp_btf/cgroup_mkdir")
+ int BPF_PROG(cgrp_ancestor_example, struct cgroup *cgrp, const char *path)
+ {
+ struct bpf_cpumask *kptr;
+ struct __bpf_cpumasks_kfunc_map_value *v;
+ u32 key = 0;
+
+ /* Assume a bpf_cpumask * kptr was previously stored in the map. */
+ v = bpf_map_lookup_elem(&__bpf_cpumasks_kfunc_map, &key);
+ if (!v)
+ return -ENOENT;
+
+ /* Acquire a reference to the bpf_cpumask * kptr that's already stored in the map. */
+ kptr = bpf_cpumask_kptr_get(&v->cpumask);
+ if (!kptr)
+ /* If no bpf_cpumask was present in the map, it's because
+ * we're racing with another CPU that removed it with
+ * bpf_kptr_xchg() between the bpf_map_lookup_elem()
+ * above, and our call to bpf_cpumask_kptr_get().
+ * bpf_cpumask_kptr_get() internally safely handles this
+ * race, and will return NULL if the cpumask is no longer
+ * present in the map by the time we invoke the kfunc.
+ */
+ return -EBUSY;
+
+ /* Free the reference we just took above. Note that the
+ * original struct bpf_cpumask * kptr is still in the map. It will
+ * be freed either at a later time if another context deletes
+ * it from the map, or automatically by the BPF subsystem if
+ * it's still present when the map is destroyed.
+ */
+ bpf_cpumask_release(kptr);
+
+ return 0;
+ }
+
+----
+
+2.2 ``struct cpumask``
+----------------------
+
+``struct cpumask`` is the object that actually contains the cpumask bitmap
+being queried, mutated, etc. A ``struct bpf_cpumask`` wraps a ``struct
+cpumask``, which is why it's safe to cast it as such (note however that it is
+**not** safe to cast a ``struct cpumask *`` to a ``struct bpf_cpumask *``, and
+the verifier will reject any program that tries to do so).
+
+As we'll see below, any kfunc that mutates its cpumask argument will take a
+``struct bpf_cpumask *`` as that argument. Any argument that simply queries the
+cpumask will instead take a ``struct cpumask *``.
+
+3. cpumask kfuncs
+=================
+
+Above, we described the kfuncs that can be used to allocate, acquire, release,
+etc a ``struct bpf_cpumask *``. This section of the document will describe the
+kfuncs for mutating and querying cpumasks.
+
+3.1 Mutating cpumasks
+---------------------
+
+Some cpumask kfuncs are "read-only" in that they don't mutate any of their
+arguments, whereas others mutate at least one argument (which means that the
+argument must be a ``struct bpf_cpumask *``, as described above).
+
+This section will describe all of the cpumask kfuncs which mutate at least one
+argument. :ref:`cpumasks-querying-label` below describes the read-only kfuncs.
+
+3.1.1 Setting and clearing CPUs
+-------------------------------
+
+bpf_cpumask_set_cpu() and bpf_cpumask_clear_cpu() can be used to set and clear
+a CPU in a ``struct bpf_cpumask`` respectively:
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_set_cpu bpf_cpumask_clear_cpu
+
+These kfuncs are pretty straightforward, and can be used, for example, as
+follows:
+
+.. code-block:: c
+
+ /**
+ * A sample tracepoint showing how a cpumask can be queried.
+ */
+ SEC("tp_btf/task_newtask")
+ int BPF_PROG(test_set_clear_cpu, struct task_struct *task, u64 clone_flags)
+ {
+ struct bpf_cpumask *cpumask;
+
+ cpumask = bpf_cpumask_create();
+ if (!cpumask)
+ return -ENOMEM;
+
+ bpf_cpumask_set_cpu(0, cpumask);
+ if (!bpf_cpumask_test_cpu(0, cast(cpumask)))
+ /* Should never happen. */
+ goto release_exit;
+
+ bpf_cpumask_clear_cpu(0, cpumask);
+ if (bpf_cpumask_test_cpu(0, cast(cpumask)))
+ /* Should never happen. */
+ goto release_exit;
+
+ /* struct cpumask * pointers such as task->cpus_ptr can also be queried. */
+ if (bpf_cpumask_test_cpu(0, task->cpus_ptr))
+ bpf_printk("task %s can use CPU %d", task->comm, 0);
+
+ release_exit:
+ bpf_cpumask_release(cpumask);
+ return 0;
+ }
+
+----
+
+bpf_cpumask_test_and_set_cpu() and bpf_cpumask_test_and_clear_cpu() are
+analogous kfuncs that allow callers to atomically test and set (or clear) CPUs:
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_test_and_set_cpu bpf_cpumask_test_and_clear_cpu
+
+----
+
+We can also set and clear entire ``struct bpf_cpumask *`` objects in one
+operation using bpf_cpumask_setall() and bpf_cpumask_clear():
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_setall bpf_cpumask_clear
+
+3.1.2 Operations between cpumasks
+---------------------------------
+
+In addition to setting and clearing individual CPUs in a single cpumask,
+callers can also perform bitwise operations between multiple cpumasks using
+bpf_cpumask_and(), bpf_cpumask_or(), and bpf_cpumask_xor():
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_and bpf_cpumask_or bpf_cpumask_xor
+
+The following is an example of how they may be used. Note that some of the
+kfuncs shown in this example will be covered in more detail below.
+
+.. code-block:: c
+
+ /**
+ * A sample tracepoint showing how a cpumask can be mutated using
+ bitwise operators (and queried).
+ */
+ SEC("tp_btf/task_newtask")
+ int BPF_PROG(test_set_clear_cpu, struct task_struct *task, u64 clone_flags)
+ {
+ struct bpf_cpumask *mask1, *mask2, *dst1, *dst2;
+ int ret = -EINVAL;
+
+ mask1 = bpf_cpumask_create();
+ if (!mask1)
+ return -ENOMEM;
+
+ mask2 = bpf_cpumask_create();
+ if (!mask2) {
+ bpf_cpumask_release(mask1);
+ return -ENOMEM;
+ }
+
+ // ...Safely create the other two masks... */
+
+ bpf_cpumask_set_cpu(0, mask1);
+ bpf_cpumask_set_cpu(1, mask2);
+ bpf_cpumask_and(dst1, (const struct cpumask *)mask1, (const struct cpumask *)mask2);
+ if (!bpf_cpumask_empty((const struct cpumask *)dst1))
+ /* Should never happen. */
+ goto release_exit;
+
+ bpf_cpumask_or(dst1, (const struct cpumask *)mask1, (const struct cpumask *)mask2);
+ if (!bpf_cpumask_test_cpu(0, (const struct cpumask *)dst1))
+ /* Should never happen. */
+ goto release_exit;
+
+ if (!bpf_cpumask_test_cpu(1, (const struct cpumask *)dst1))
+ /* Should never happen. */
+ goto release_exit;
+
+ bpf_cpumask_xor(dst2, (const struct cpumask *)mask1, (const struct cpumask *)mask2);
+ if (!bpf_cpumask_equal((const struct cpumask *)dst1,
+ (const struct cpumask *)dst2))
+ /* Should never happen. */
+ goto release_exit;
+
+ err = 0;
+
+ release_exit:
+ bpf_cpumask_release(mask1);
+ bpf_cpumask_release(mask2);
+ bpf_cpumask_release(dst1);
+ bpf_cpumask_release(dst2);
+ return 0;
+ }
+
+----
+
+The contents of an entire cpumask may be copied to another using
+bpf_cpumask_copy():
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_copy
+
+----
+
+.. _cpumasks-querying-label:
+
+3.2 Querying cpumasks
+---------------------
+
+In addition to the above kfuncs, there is also a set of read-only kfuncs that
+can be used to query the contents of cpumasks.
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_first bpf_cpumask_first_zero bpf_cpumask_test_cpu
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_equal bpf_cpumask_intersects bpf_cpumask_subset
+ bpf_cpumask_empty bpf_cpumask_full
+
+.. kernel-doc:: kernel/bpf/cpumask.c
+ :identifiers: bpf_cpumask_any bpf_cpumask_any_and
+
+----
+
+Some example usages of these querying kfuncs were shown above. We will not
+replicate those exmaples here. Note, however, that all of the aforementioned
+kfuncs are tested in `tools/testing/selftests/bpf/progs/cpumask_success.c`_, so
+please take a look there if you're looking for more examples of how they can be
+used.
+
+.. _tools/testing/selftests/bpf/progs/cpumask_success.c:
+ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/progs/cpumask_success.c
+
+
+4. Adding BPF cpumask kfuncs
+============================
+
+The set of supported BPF cpumask kfuncs are not (yet) a 1-1 match with the
+cpumask operations in include/linux/cpumask.h. Any of those cpumask operations
+could easily be encapsulated in a new kfunc if and when required. If you'd like
+to support a new cpumask operation, please feel free to submit a patch. If you
+do add a new cpumask kfunc, please document it here, and add any relevant
+selftest testcases to the cpumask selftest suite.
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index b81533d8b061..dbb39e8f9889 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -20,6 +20,7 @@ that goes into great technical depth about the BPF Architecture.
syscall_api
helpers
kfuncs
+ cpumasks
programs
maps
bpf_prog_run
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst
index 9fd7fb539f85..a74f9e74087b 100644
--- a/Documentation/bpf/kfuncs.rst
+++ b/Documentation/bpf/kfuncs.rst
@@ -1,3 +1,7 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _kfuncs-header-label:
+
=============================
BPF Kernel Functions (kfuncs)
=============================
@@ -420,3 +424,10 @@ the verifier. bpf_cgroup_ancestor() can be used as follows:
bpf_cgroup_release(parent);
return 0;
}
+
+3.3 struct cpumask * kfuncs
+---------------------------
+
+BPF provides a set of kfuncs that can be used to query, allocate, mutate, and
+destroy struct cpumask * objects. Please refer to :ref:`cpumasks-header-label`
+for more details.
diff --git a/kernel/bpf/cpumask.c b/kernel/bpf/cpumask.c
index e1fa15a7e079..91fdd07ee9fc 100644
--- a/kernel/bpf/cpumask.c
+++ b/kernel/bpf/cpumask.c
@@ -35,6 +35,16 @@ static bool cpu_valid(u32 cpu)
return cpu < nr_cpu_ids;
}
+/**
+ * bpf_cpumask_create() - Create a mutable BPF cpumask.
+ *
+ * Allocates a cpumask that can be queried, mutated, acquired, and released by
+ * a BPF program. The cpumask returned by this function must either be embedded
+ * in a map as a kptr, or freed with bpf_cpumask_release().
+ *
+ * bpf_cpumask_create() allocates memory using the BPF memory allocator, and
+ * will not block. It may return NULL if no memory is available.
+ */
struct bpf_cpumask *bpf_cpumask_create(void)
{
struct bpf_cpumask *cpumask;
@@ -49,12 +59,31 @@ struct bpf_cpumask *bpf_cpumask_create(void)
return cpumask;
}
+/**
+ * bpf_cpumask_acquire() - Acquire a reference to a BPF cpumask.
+ * @cpumask: The BPF cpumask being acquired. The cpumask must be a trusted
+ * pointer.
+ *
+ * Acquires a reference to a BPF cpumask. The cpumask returned by this function
+ * must either be embedded in a map as a kptr, or freed with
+ * bpf_cpumask_release().
+ */
struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
{
refcount_inc(&cpumask->usage);
return cpumask;
}
+/**
+ * bpf_cpumask_kptr_get() - Attempt to acquire a reference to a BPF cpumask
+ * stored in a map.
+ * @cpumaskp: A pointer to a BPF cpumask map value.
+ *
+ * Attempts to acquire a reference to a BPF cpumask stored in a map value. The
+ * cpumask returned by this function must either be embedded in a map as a
+ * kptr, or freed with bpf_cpumask_release(). This function may return NULL if
+ * no BPF cpumask was found in the specified map value.
+ */
struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
{
struct bpf_cpumask *cpumask;
@@ -73,6 +102,14 @@ struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
return cpumask;
}
+/**
+ * bpf_cpumask_release() - Release a previously acquired BPF cpumask.
+ * @cpumask: The cpumask being released.
+ *
+ * Releases a previously acquired reference to a BPF cpumask. When the final
+ * reference of the BPF cpumask has been released, it is subsequently freed in
+ * an RCU callback in the BPF memory allocator.
+ */
void bpf_cpumask_release(struct bpf_cpumask *cpumask)
{
if (!cpumask)
@@ -85,16 +122,36 @@ void bpf_cpumask_release(struct bpf_cpumask *cpumask)
}
}
+/**
+ * bpf_cpumask_first() - Get the index of the first nonzero bit in the cpumask.
+ * @cpumask: The cpumask being queried.
+ *
+ * Find the index of the first nonzero bit of the cpumask. A struct bpf_cpumask
+ * pointer may be safely passed to this function.
+ */
u32 bpf_cpumask_first(const struct cpumask *cpumask)
{
return cpumask_first(cpumask);
}
+/**
+ * bpf_cpumask_first_zero() - Get the index of the first unset bit in the
+ * cpumask.
+ * @cpumask: The cpumask being queried.
+ *
+ * Find the index of the first unset bit of the cpumask. A struct bpf_cpumask
+ * pointer may be safely passed to this function.
+ */
u32 bpf_cpumask_first_zero(const struct cpumask *cpumask)
{
return cpumask_first_zero(cpumask);
}
+/**
+ * bpf_cpumask_set_cpu() - Set a bit for a CPU in a BPF cpumask.
+ * @cpu: The CPU to be set in the cpumask.
+ * @cpumask: The BPF cpumask in which a bit is being set.
+ */
void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
{
if (!cpu_valid(cpu))
@@ -103,6 +160,11 @@ void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
cpumask_set_cpu(cpu, (struct cpumask *)cpumask);
}
+/**
+ * bpf_cpumask_clear_cpu() - Clear a bit for a CPU in a BPF cpumask.
+ * @cpu: The CPU to be cleared from the cpumask.
+ * @cpumask: The BPF cpumask in which a bit is being cleared.
+ */
void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
{
if (!cpu_valid(cpu))
@@ -111,6 +173,15 @@ void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
cpumask_clear_cpu(cpu, (struct cpumask *)cpumask);
}
+/**
+ * bpf_cpumask_test_cpu() - Test whether a CPU is set in a cpumask.
+ * @cpu: The CPU being queried for.
+ * @cpumask: The cpumask being queried for containing a CPU.
+ *
+ * Return:
+ * * true - @cpu is set in the cpumask
+ * * false - @cpu was not set in the cpumask, or @cpu is an invalid cpu.
+ */
bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
{
if (!cpu_valid(cpu))
@@ -119,6 +190,15 @@ bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
return cpumask_test_cpu(cpu, (struct cpumask *)cpumask);
}
+/**
+ * bpf_cpumask_test_and_set_cpu() - Atomically test and set a CPU in a BPF cpumask.
+ * @cpu: The CPU being set and queried for.
+ * @cpumask: The BPF cpumask being set and queried for containing a CPU.
+ *
+ * Return:
+ * * true - @cpu is set in the cpumask
+ * * false - @cpu was not set in the cpumask, or @cpu is invalid.
+ */
bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
{
if (!cpu_valid(cpu))
@@ -127,6 +207,16 @@ bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
return cpumask_test_and_set_cpu(cpu, (struct cpumask *)cpumask);
}
+/**
+ * bpf_cpumask_test_and_clear_cpu() - Atomically test and clear a CPU in a BPF
+ * cpumask.
+ * @cpu: The CPU being cleared and queried for.
+ * @cpumask: The BPF cpumask being cleared and queried for containing a CPU.
+ *
+ * Return:
+ * * true - @cpu is set in the cpumask
+ * * false - @cpu was not set in the cpumask, or @cpu is invalid.
+ */
bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
{
if (!cpu_valid(cpu))
@@ -135,16 +225,36 @@ bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
return cpumask_test_and_clear_cpu(cpu, (struct cpumask *)cpumask);
}
+/**
+ * bpf_cpumask_setall() - Set all of the bits in a BPF cpumask.
+ * @cpumask: The BPF cpumask having all of its bits set.
+ */
void bpf_cpumask_setall(struct bpf_cpumask *cpumask)
{
cpumask_setall((struct cpumask *)cpumask);
}
+/**
+ * bpf_cpumask_clear() - Clear all of the bits in a BPF cpumask.
+ * @cpumask: The BPF cpumask being cleared.
+ */
void bpf_cpumask_clear(struct bpf_cpumask *cpumask)
{
cpumask_clear((struct cpumask *)cpumask);
}
+/**
+ * bpf_cpumask_and() - AND two cpumasks and store the result.
+ * @dst: The BPF cpumask where the result is being stored.
+ * @src1: The first input.
+ * @src2: The second input.
+ *
+ * Return:
+ * * true - @dst has at least one bit set following the operation
+ * * false - @dst is empty following the operation
+ *
+ * struct bpf_cpumask pointers may be safely passed to @src1 and @src2.
+ */
bool bpf_cpumask_and(struct bpf_cpumask *dst,
const struct cpumask *src1,
const struct cpumask *src2)
@@ -152,6 +262,14 @@ bool bpf_cpumask_and(struct bpf_cpumask *dst,
return cpumask_and((struct cpumask *)dst, src1, src2);
}
+/**
+ * bpf_cpumask_or() - OR two cpumasks and store the result.
+ * @dst: The BPF cpumask where the result is being stored.
+ * @src1: The first input.
+ * @src2: The second input.
+ *
+ * struct bpf_cpumask pointers may be safely passed to @src1 and @src2.
+ */
void bpf_cpumask_or(struct bpf_cpumask *dst,
const struct cpumask *src1,
const struct cpumask *src2)
@@ -159,6 +277,14 @@ void bpf_cpumask_or(struct bpf_cpumask *dst,
cpumask_or((struct cpumask *)dst, src1, src2);
}
+/**
+ * bpf_cpumask_xor() - XOR two cpumasks and store the result.
+ * @dst: The BPF cpumask where the result is being stored.
+ * @src1: The first input.
+ * @src2: The second input.
+ *
+ * struct bpf_cpumask pointers may be safely passed to @src1 and @src2.
+ */
void bpf_cpumask_xor(struct bpf_cpumask *dst,
const struct cpumask *src1,
const struct cpumask *src2)
@@ -166,41 +292,123 @@ void bpf_cpumask_xor(struct bpf_cpumask *dst,
cpumask_xor((struct cpumask *)dst, src1, src2);
}
+/**
+ * bpf_cpumask_equal() - Check two cpumasks for equality.
+ * @src1: The first input.
+ * @src2: The second input.
+ *
+ * Return:
+ * * true - @src1 and @src2 have the same bits set.
+ * * false - @src1 and @src2 differ in at least one bit.
+ *
+ * struct bpf_cpumask pointers may be safely passed to @src1 and @src2.
+ */
bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2)
{
return cpumask_equal(src1, src2);
}
+/**
+ * bpf_cpumask_intersects() - Check two cpumasks for overlap.
+ * @src1: The first input.
+ * @src2: The second input.
+ *
+ * Return:
+ * * true - @src1 and @src2 have at least one of the same bits set.
+ * * false - @src1 and @src2 don't have any of the same bits set.
+ *
+ * struct bpf_cpumask pointers may be safely passed to @src1 and @src2.
+ */
bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2)
{
return cpumask_intersects(src1, src2);
}
+/**
+ * bpf_cpumask_subset() - Check if a cpumask is a subset of another.
+ * @src1: The first cpumask being checked as a subset.
+ * @src2: The second cpumask being checked as a superset.
+ *
+ * Return:
+ * * true - All of the bits of @src1 are set in @src2.
+ * * false - At least one bit in @src1 is not set in @src2.
+ *
+ * struct bpf_cpumask pointers may be safely passed to @src1 and @src2.
+ */
bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2)
{
return cpumask_subset(src1, src2);
}
+/**
+ * bpf_cpumask_empty() - Check if a cpumask is empty.
+ * @cpumask: The cpumask being checked.
+ *
+ * Return:
+ * * true - None of the bits in @cpumask are set.
+ * * false - At least one bit in @cpumask is set.
+ *
+ * A struct bpf_cpumask pointer may be safely passed to @cpumask.
+ */
bool bpf_cpumask_empty(const struct cpumask *cpumask)
{
return cpumask_empty(cpumask);
}
+/**
+ * bpf_cpumask_full() - Check if a cpumask has all bits set.
+ * @cpumask: The cpumask being checked.
+ *
+ * Return:
+ * * true - All of the bits in @cpumask are set.
+ * * false - At least one bit in @cpumask is cleared.
+ *
+ * A struct bpf_cpumask pointer may be safely passed to @cpumask.
+ */
bool bpf_cpumask_full(const struct cpumask *cpumask)
{
return cpumask_full(cpumask);
}
+/**
+ * bpf_cpumask_copy() - Copy the contents of a cpumask into a BPF cpumask.
+ * @dst: The BPF cpumask being copied into.
+ * @src: The cpumask being copied.
+ *
+ * A struct bpf_cpumask pointer may be safely passed to @src.
+ */
void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src)
{
cpumask_copy((struct cpumask *)dst, src);
}
+/**
+ * bpf_cpumask_any() - Return a random set CPU from a cpumask.
+ * @cpumask: The cpumask being queried.
+ *
+ * Return:
+ * * A random set bit within [0, num_cpus) if at least one bit is set.
+ * * >= num_cpus if no bit is set.
+ *
+ * A struct bpf_cpumask pointer may be safely passed to @src.
+ */
u32 bpf_cpumask_any(const struct cpumask *cpumask)
{
return cpumask_any(cpumask);
}
+/**
+ * bpf_cpumask_any_and() - Return a random set CPU from the AND of two
+ * cpumasks.
+ * @src1: The first cpumask.
+ * @src2: The second cpumask.
+ *
+ * Return:
+ * * A random set bit within [0, num_cpus) if at least one bit is set.
+ * * >= num_cpus if no bit is set.
+ *
+ * struct bpf_cpumask pointers may be safely passed to @src1 and @src2.
+ */
u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2)
{
return cpumask_any_and(src1, src2);
--
2.39.0
Hi David,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20230119235833.2948341-2-void%40manifault.com
patch subject: [PATCH bpf-next 1/8] bpf: Enable annotating trusted nested pointers
config: ia64-allyesconfig (https://download.01.org/0day-ci/archive/20230120/[email protected]/config)
compiler: ia64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/8f6df14342b1be3516f8e21037edf771df851427
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
git checkout 8f6df14342b1be3516f8e21037edf771df851427
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 SHELL=/bin/bash kernel/bpf/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All warnings (new ones prefixed by >>):
>> kernel/bpf/btf.c:533:5: warning: no previous prototype for 'bpf_find_btf_id' [-Wmissing-prototypes]
533 | s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
| ^~~~~~~~~~~~~~~
kernel/bpf/btf.c: In function 'btf_seq_show':
kernel/bpf/btf.c:6977:29: warning: function 'btf_seq_show' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
6977 | seq_vprintf((struct seq_file *)show->target, fmt, args);
| ^~~~~~~~
kernel/bpf/btf.c: In function 'btf_snprintf_show':
kernel/bpf/btf.c:7014:9: warning: function 'btf_snprintf_show' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
7014 | len = vsnprintf(show->target, ssnprintf->len_left, fmt, args);
| ^~~
vim +/bpf_find_btf_id +533 kernel/bpf/btf.c
532
> 533 s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
534 {
535 struct btf *btf;
536 s32 ret;
537 int id;
538
539 btf = bpf_get_btf_vmlinux();
540 if (IS_ERR(btf))
541 return PTR_ERR(btf);
542 if (!btf)
543 return -EINVAL;
544
545 ret = btf_find_by_name_kind(btf, name, kind);
546 /* ret is never zero, since btf_find_by_name_kind returns
547 * positive btf_id or negative error.
548 */
549 if (ret > 0) {
550 btf_get(btf);
551 *btf_p = btf;
552 return ret;
553 }
554
555 /* If name is not found in vmlinux's BTF then search in module's BTFs */
556 spin_lock_bh(&btf_idr_lock);
557 idr_for_each_entry(&btf_idr, btf, id) {
558 if (!btf_is_module(btf))
559 continue;
560 /* linear search could be slow hence unlock/lock
561 * the IDR to avoiding holding it for too long
562 */
563 btf_get(btf);
564 spin_unlock_bh(&btf_idr_lock);
565 ret = btf_find_by_name_kind(btf, name, kind);
566 if (ret > 0) {
567 *btf_p = btf;
568 return ret;
569 }
570 spin_lock_bh(&btf_idr_lock);
571 btf_put(btf);
572 }
573 spin_unlock_bh(&btf_idr_lock);
574 return ret;
575 }
576
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
Hi David,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20230119235833.2948341-5-void%40manifault.com
patch subject: [PATCH bpf-next 4/8] bpf: Enable cpumasks to be queried and used as kptrs
config: sparc-allyesconfig (https://download.01.org/0day-ci/archive/20230120/[email protected]/config)
compiler: sparc64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/11541205c58f2226e5ffbc5967317469d65efac6
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
git checkout 11541205c58f2226e5ffbc5967317469d65efac6
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sparc olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sparc SHELL=/bin/bash kernel/bpf/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All warnings (new ones prefixed by >>):
>> kernel/bpf/cpumask.c:38:21: warning: no previous prototype for 'bpf_cpumask_create' [-Wmissing-prototypes]
38 | struct bpf_cpumask *bpf_cpumask_create(void)
| ^~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:52:21: warning: no previous prototype for 'bpf_cpumask_acquire' [-Wmissing-prototypes]
52 | struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:58:21: warning: no previous prototype for 'bpf_cpumask_kptr_get' [-Wmissing-prototypes]
58 | struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
| ^~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:76:6: warning: no previous prototype for 'bpf_cpumask_release' [-Wmissing-prototypes]
76 | void bpf_cpumask_release(struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:88:5: warning: no previous prototype for 'bpf_cpumask_first' [-Wmissing-prototypes]
88 | u32 bpf_cpumask_first(const struct cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:93:5: warning: no previous prototype for 'bpf_cpumask_first_zero' [-Wmissing-prototypes]
93 | u32 bpf_cpumask_first_zero(const struct cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:98:6: warning: no previous prototype for 'bpf_cpumask_set_cpu' [-Wmissing-prototypes]
98 | void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:106:6: warning: no previous prototype for 'bpf_cpumask_clear_cpu' [-Wmissing-prototypes]
106 | void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:114:6: warning: no previous prototype for 'bpf_cpumask_test_cpu' [-Wmissing-prototypes]
114 | bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:122:6: warning: no previous prototype for 'bpf_cpumask_test_and_set_cpu' [-Wmissing-prototypes]
122 | bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:130:6: warning: no previous prototype for 'bpf_cpumask_test_and_clear_cpu' [-Wmissing-prototypes]
130 | bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:138:6: warning: no previous prototype for 'bpf_cpumask_setall' [-Wmissing-prototypes]
138 | void bpf_cpumask_setall(struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:143:6: warning: no previous prototype for 'bpf_cpumask_clear' [-Wmissing-prototypes]
143 | void bpf_cpumask_clear(struct bpf_cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:148:6: warning: no previous prototype for 'bpf_cpumask_and' [-Wmissing-prototypes]
148 | bool bpf_cpumask_and(struct bpf_cpumask *dst,
| ^~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:155:6: warning: no previous prototype for 'bpf_cpumask_or' [-Wmissing-prototypes]
155 | void bpf_cpumask_or(struct bpf_cpumask *dst,
| ^~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:162:6: warning: no previous prototype for 'bpf_cpumask_xor' [-Wmissing-prototypes]
162 | void bpf_cpumask_xor(struct bpf_cpumask *dst,
| ^~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:169:6: warning: no previous prototype for 'bpf_cpumask_equal' [-Wmissing-prototypes]
169 | bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2)
| ^~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:174:6: warning: no previous prototype for 'bpf_cpumask_intersects' [-Wmissing-prototypes]
174 | bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2)
| ^~~~~~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:179:6: warning: no previous prototype for 'bpf_cpumask_subset' [-Wmissing-prototypes]
179 | bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2)
| ^~~~~~~~~~~~~~~~~~
>> kernel/bpf/cpumask.c:184:6: warning: no previous prototype for 'bpf_cpumask_empty' [-Wmissing-prototypes]
184 | bool bpf_cpumask_empty(const struct cpumask *cpumask)
| ^~~~~~~~~~~~~~~~~
kernel/bpf/cpumask.c:189:6: warning: no previous prototype for 'bpf_cpumask_full' [-Wmissing-prototypes]
189 | bool bpf_cpumask_full(const struct cpumask *cpumask)
| ^~~~~~~~~~~~~~~~
kernel/bpf/cpumask.c:194:6: warning: no previous prototype for 'bpf_cpumask_copy' [-Wmissing-prototypes]
194 | void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src)
| ^~~~~~~~~~~~~~~~
kernel/bpf/cpumask.c:199:5: warning: no previous prototype for 'bpf_cpumask_any' [-Wmissing-prototypes]
199 | u32 bpf_cpumask_any(const struct cpumask *cpumask)
| ^~~~~~~~~~~~~~~
kernel/bpf/cpumask.c:204:5: warning: no previous prototype for 'bpf_cpumask_any_and' [-Wmissing-prototypes]
204 | u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2)
| ^~~~~~~~~~~~~~~~~~~
vim +/bpf_cpumask_create +38 kernel/bpf/cpumask.c
37
> 38 struct bpf_cpumask *bpf_cpumask_create(void)
39 {
40 struct bpf_cpumask *cpumask;
41
42 cpumask = bpf_mem_alloc(&bpf_cpumask_ma, sizeof(*cpumask));
43 if (!cpumask)
44 return NULL;
45
46 memset(cpumask, 0, sizeof(*cpumask));
47 refcount_set(&cpumask->usage, 1);
48
49 return cpumask;
50 }
51
> 52 struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
53 {
54 refcount_inc(&cpumask->usage);
55 return cpumask;
56 }
57
> 58 struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
59 {
60 struct bpf_cpumask *cpumask;
61
62 /* The BPF memory allocator frees memory backing its caches in an RCU
63 * callback. Thus, we can safely use RCU to ensure that the cpumask is
64 * safe to read.
65 */
66 rcu_read_lock();
67
68 cpumask = READ_ONCE(*cpumaskp);
69 if (cpumask && !refcount_inc_not_zero(&cpumask->usage))
70 cpumask = NULL;
71
72 rcu_read_unlock();
73 return cpumask;
74 }
75
> 76 void bpf_cpumask_release(struct bpf_cpumask *cpumask)
77 {
78 if (!cpumask)
79 return;
80
81 if (refcount_dec_and_test(&cpumask->usage)) {
82 migrate_disable();
83 bpf_mem_free(&bpf_cpumask_ma, cpumask);
84 migrate_enable();
85 }
86 }
87
> 88 u32 bpf_cpumask_first(const struct cpumask *cpumask)
89 {
90 return cpumask_first(cpumask);
91 }
92
> 93 u32 bpf_cpumask_first_zero(const struct cpumask *cpumask)
94 {
95 return cpumask_first_zero(cpumask);
96 }
97
> 98 void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
99 {
100 if (!cpu_valid(cpu))
101 return;
102
103 cpumask_set_cpu(cpu, (struct cpumask *)cpumask);
104 }
105
> 106 void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
107 {
108 if (!cpu_valid(cpu))
109 return;
110
111 cpumask_clear_cpu(cpu, (struct cpumask *)cpumask);
112 }
113
> 114 bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
115 {
116 if (!cpu_valid(cpu))
117 return false;
118
119 return cpumask_test_cpu(cpu, (struct cpumask *)cpumask);
120 }
121
> 122 bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
123 {
124 if (!cpu_valid(cpu))
125 return false;
126
127 return cpumask_test_and_set_cpu(cpu, (struct cpumask *)cpumask);
128 }
129
> 130 bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
131 {
132 if (!cpu_valid(cpu))
133 return false;
134
135 return cpumask_test_and_clear_cpu(cpu, (struct cpumask *)cpumask);
136 }
137
> 138 void bpf_cpumask_setall(struct bpf_cpumask *cpumask)
139 {
140 cpumask_setall((struct cpumask *)cpumask);
141 }
142
> 143 void bpf_cpumask_clear(struct bpf_cpumask *cpumask)
144 {
145 cpumask_clear((struct cpumask *)cpumask);
146 }
147
> 148 bool bpf_cpumask_and(struct bpf_cpumask *dst,
149 const struct cpumask *src1,
150 const struct cpumask *src2)
151 {
152 return cpumask_and((struct cpumask *)dst, src1, src2);
153 }
154
> 155 void bpf_cpumask_or(struct bpf_cpumask *dst,
156 const struct cpumask *src1,
157 const struct cpumask *src2)
158 {
159 cpumask_or((struct cpumask *)dst, src1, src2);
160 }
161
> 162 void bpf_cpumask_xor(struct bpf_cpumask *dst,
163 const struct cpumask *src1,
164 const struct cpumask *src2)
165 {
166 cpumask_xor((struct cpumask *)dst, src1, src2);
167 }
168
> 169 bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2)
170 {
171 return cpumask_equal(src1, src2);
172 }
173
> 174 bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2)
175 {
176 return cpumask_intersects(src1, src2);
177 }
178
> 179 bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2)
180 {
181 return cpumask_subset(src1, src2);
182 }
183
> 184 bool bpf_cpumask_empty(const struct cpumask *cpumask)
185 {
186 return cpumask_empty(cpumask);
187 }
188
> 189 bool bpf_cpumask_full(const struct cpumask *cpumask)
190 {
191 return cpumask_full(cpumask);
192 }
193
> 194 void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src)
195 {
196 cpumask_copy((struct cpumask *)dst, src);
197 }
198
> 199 u32 bpf_cpumask_any(const struct cpumask *cpumask)
200 {
201 return cpumask_any(cpumask);
202 }
203
> 204 u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2)
205 {
206 return cpumask_any_and(src1, src2);
207 }
208
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
On Fri, Jan 20, 2023 at 09:14:25AM +0800, kernel test robot wrote:
> Hi David,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on bpf-next/master]
>
> url: https://github.com/intel-lab-lkp/linux/commits/David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
> base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> patch link: https://lore.kernel.org/r/20230119235833.2948341-2-void%40manifault.com
> patch subject: [PATCH bpf-next 1/8] bpf: Enable annotating trusted nested pointers
> config: ia64-allyesconfig (https://download.01.org/0day-ci/archive/20230120/[email protected]/config)
> compiler: ia64-linux-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/intel-lab-lkp/linux/commit/8f6df14342b1be3516f8e21037edf771df851427
> git remote add linux-review https://github.com/intel-lab-lkp/linux
> git fetch --no-tags linux-review David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
> git checkout 8f6df14342b1be3516f8e21037edf771df851427
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 olddefconfig
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=ia64 SHELL=/bin/bash kernel/bpf/
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <[email protected]>
>
> All warnings (new ones prefixed by >>):
>
> >> kernel/bpf/btf.c:533:5: warning: no previous prototype for 'bpf_find_btf_id' [-Wmissing-prototypes]
> 533 | s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
Silly mistake on my part. I removed static while debugging something in
verifier.c and forgot to put it back. I'll put it back in v2.
> | ^~~~~~~~~~~~~~~
> kernel/bpf/btf.c: In function 'btf_seq_show':
> kernel/bpf/btf.c:6977:29: warning: function 'btf_seq_show' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
> 6977 | seq_vprintf((struct seq_file *)show->target, fmt, args);
> | ^~~~~~~~
> kernel/bpf/btf.c: In function 'btf_snprintf_show':
> kernel/bpf/btf.c:7014:9: warning: function 'btf_snprintf_show' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
> 7014 | len = vsnprintf(show->target, ssnprintf->len_left, fmt, args);
> | ^~~
>
>
> vim +/bpf_find_btf_id +533 kernel/bpf/btf.c
>
> 532
> > 533 s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
> 534 {
> 535 struct btf *btf;
> 536 s32 ret;
> 537 int id;
> 538
> 539 btf = bpf_get_btf_vmlinux();
> 540 if (IS_ERR(btf))
> 541 return PTR_ERR(btf);
> 542 if (!btf)
> 543 return -EINVAL;
> 544
> 545 ret = btf_find_by_name_kind(btf, name, kind);
> 546 /* ret is never zero, since btf_find_by_name_kind returns
> 547 * positive btf_id or negative error.
> 548 */
> 549 if (ret > 0) {
> 550 btf_get(btf);
> 551 *btf_p = btf;
> 552 return ret;
> 553 }
> 554
> 555 /* If name is not found in vmlinux's BTF then search in module's BTFs */
> 556 spin_lock_bh(&btf_idr_lock);
> 557 idr_for_each_entry(&btf_idr, btf, id) {
> 558 if (!btf_is_module(btf))
> 559 continue;
> 560 /* linear search could be slow hence unlock/lock
> 561 * the IDR to avoiding holding it for too long
> 562 */
> 563 btf_get(btf);
> 564 spin_unlock_bh(&btf_idr_lock);
> 565 ret = btf_find_by_name_kind(btf, name, kind);
> 566 if (ret > 0) {
> 567 *btf_p = btf;
> 568 return ret;
> 569 }
> 570 spin_lock_bh(&btf_idr_lock);
> 571 btf_put(btf);
> 572 }
> 573 spin_unlock_bh(&btf_idr_lock);
> 574 return ret;
> 575 }
> 576
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests
On Fri, Jan 20, 2023 at 10:36:24AM +0800, kernel test robot wrote:
> Hi David,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on bpf-next/master]
>
> url: https://github.com/intel-lab-lkp/linux/commits/David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
> base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
> patch link: https://lore.kernel.org/r/20230119235833.2948341-5-void%40manifault.com
> patch subject: [PATCH bpf-next 4/8] bpf: Enable cpumasks to be queried and used as kptrs
> config: sparc-allyesconfig (https://download.01.org/0day-ci/archive/20230120/[email protected]/config)
> compiler: sparc64-linux-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # https://github.com/intel-lab-lkp/linux/commit/11541205c58f2226e5ffbc5967317469d65efac6
> git remote add linux-review https://github.com/intel-lab-lkp/linux
> git fetch --no-tags linux-review David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
> git checkout 11541205c58f2226e5ffbc5967317469d65efac6
> # save the config file
> mkdir build_dir && cp config build_dir/.config
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sparc olddefconfig
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sparc SHELL=/bin/bash kernel/bpf/
>
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <[email protected]>
>
> All warnings (new ones prefixed by >>):
>
> >> kernel/bpf/cpumask.c:38:21: warning: no previous prototype for 'bpf_cpumask_create' [-Wmissing-prototypes]
> 38 | struct bpf_cpumask *bpf_cpumask_create(void)
Sorry, also forgot to do the __diag() dance described in [0]. I'll
include that fix in v2.
[0]: https://docs.kernel.org/bpf/kfuncs.html#creating-a-wrapper-kfunc
This is also reminding me to send out the v2 for [1]. I'll do that
after I take care of another few small things I've been putting off.
[1]: https://lore.kernel.org/lkml/[email protected]/T/
Thanks,
David
On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > currently enforces that the top-level type must match when calling
> > the kfunc. In other words, the verifier does not allow the BPF program
> > to pass a bitwise equivalent struct, despite it being functionally safe.
> > For example, if you have the following type:
> >
> > struct nf_conn___init {
> > struct nf_conn ct;
> > };
> >
> > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > struct nf_conn.
>
> Just running bpf_nf selftest would have shown this is false.
And I feel silly, because I did run them, and could have sworn they
passed...looking now at the change_status_after_alloc testcase I see
you're of course correct. Very poor example, thank you for pointing it
out.
>
> > Being able to do this will be useful for certain types
> > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > series of kfuncs will be added which allow programs to do bitwise
> > queries on cpumasks that are either allocated by the program (in which
> > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > its first element), or a cpumask that was allocated by the main kernel
> > (in which case it will just be a straight cpumask_t, as in
> > task->cpus_ptr).
> >
> > Having the two types of cpumasks allows us to distinguish between the
> > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > queried in the exact same manner as a cpumask_t, with e.g.
> > bpf_cpumask_test_cpu().
> >
> > If we were to enforce that top level types match, then a user that's
> > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > would need to be updated to expect the alias, and currently it only
> > accommodates a single alias per prog type). Additionally, not specifying
> > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > struct bpf_cpumask *, and another as a struct cpumask *
> > (i.e. cpumask_t).
> >
> > In order to enable this, this patch relaxes the constraint that a
> > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > try and be conservative and match existing behavior / expectations, this
> > patch also enforces strict type checking for acquire kfuncs. We were
> > already enforcing it for release kfuncs, so this should also improve the
> > consistency of the semantics for kfuncs.
> >
>
> What you want is to simply follow type at off = 0 (but still enforce the off = 0
> requirement). This is something which is currently done for bpf_sk_release (for
> struct sk_common) in check_reg_type, but it is not safe in general to just open
> this up for all cases. I suggest encoding this particular requirement in the
> argument, and simply using triple underscore variant of the type for the special
> 'read_only' requirement. This will allow you to use same type in your BPF C
> program, while allowing verifier to see them as two different types in kfunc
> parameters. Then just relax type following for the particular argument so that
> one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> it just visits first member after failing match on top level type). off = 0
> check is still necessary.
Sigh, yeah, another ___ workaround but I agree it's probably the best we
can do for now, and in general seems pretty useful. Obviously preferable
to this patch which just doesn't work. Alexei, are you OK with this? If
so, I'll take this approach for v2.
>
> So offset checks still need to be according to OBJ_RELEASE but you can relax
> strict_type_match bool for the particular arg when calling btf_struct_ids_match.
>
> All code in your tests will then deal with a cpumask_t type only, including in
> kfunc declarations. Same as bpf_nf selftests which don't cast from/to
> nf_conn___init and only deal with nf_conn pointers even though semantics differ
> depending on how it is used and passed around. Overall more convenient and
> simple to use.
On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > currently enforces that the top-level type must match when calling
> > > the kfunc. In other words, the verifier does not allow the BPF program
> > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > For example, if you have the following type:
> > >
> > > struct nf_conn___init {
> > > struct nf_conn ct;
> > > };
> > >
> > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > struct nf_conn.
> >
> > Just running bpf_nf selftest would have shown this is false.
>
> And I feel silly, because I did run them, and could have sworn they
> passed...looking now at the change_status_after_alloc testcase I see
> you're of course correct. Very poor example, thank you for pointing it
> out.
>
> >
> > > Being able to do this will be useful for certain types
> > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > series of kfuncs will be added which allow programs to do bitwise
> > > queries on cpumasks that are either allocated by the program (in which
> > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > its first element), or a cpumask that was allocated by the main kernel
> > > (in which case it will just be a straight cpumask_t, as in
> > > task->cpus_ptr).
> > >
> > > Having the two types of cpumasks allows us to distinguish between the
> > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > queried in the exact same manner as a cpumask_t, with e.g.
> > > bpf_cpumask_test_cpu().
> > >
> > > If we were to enforce that top level types match, then a user that's
> > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > would need to be updated to expect the alias, and currently it only
> > > accommodates a single alias per prog type). Additionally, not specifying
> > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > struct bpf_cpumask *, and another as a struct cpumask *
> > > (i.e. cpumask_t).
> > >
> > > In order to enable this, this patch relaxes the constraint that a
> > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > try and be conservative and match existing behavior / expectations, this
> > > patch also enforces strict type checking for acquire kfuncs. We were
> > > already enforcing it for release kfuncs, so this should also improve the
> > > consistency of the semantics for kfuncs.
> > >
> >
> > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > requirement). This is something which is currently done for bpf_sk_release (for
> > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > this up for all cases. I suggest encoding this particular requirement in the
> > argument, and simply using triple underscore variant of the type for the special
> > 'read_only' requirement. This will allow you to use same type in your BPF C
> > program, while allowing verifier to see them as two different types in kfunc
> > parameters. Then just relax type following for the particular argument so that
> > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > it just visits first member after failing match on top level type). off = 0
> > check is still necessary.
>
> Sigh, yeah, another ___ workaround but I agree it's probably the best we
> can do for now, and in general seems pretty useful. Obviously preferable
> to this patch which just doesn't work. Alexei, are you OK with this? If
> so, I'll take this approach for v2.
We decided to rely on strict type match when we introduced 'struct nf_conn___init',
but with that we twisted the C standard to, what looks to be, a wrong direction.
For definition:
struct nf_conn___init {
struct nf_conn ct;
};
if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
for both read and write, because in C that's valid and safe type cast.
We can fix this design issue by saying that '___init' suffix is special and
C type casting rules don't apply to it.
In all other cases bpf_cpumask/cpumask would should allow it.
__ro suffix idea will keep moving us into further discrepancies with C.
On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> currently enforces that the top-level type must match when calling
> the kfunc. In other words, the verifier does not allow the BPF program
> to pass a bitwise equivalent struct, despite it being functionally safe.
> For example, if you have the following type:
>
> struct nf_conn___init {
> struct nf_conn ct;
> };
>
> It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> struct nf_conn.
Just running bpf_nf selftest would have shown this is false.
> Being able to do this will be useful for certain types
> of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> series of kfuncs will be added which allow programs to do bitwise
> queries on cpumasks that are either allocated by the program (in which
> case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> its first element), or a cpumask that was allocated by the main kernel
> (in which case it will just be a straight cpumask_t, as in
> task->cpus_ptr).
>
> Having the two types of cpumasks allows us to distinguish between the
> two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> cannot be. On the other hand, a struct bpf_cpumask can of course be
> queried in the exact same manner as a cpumask_t, with e.g.
> bpf_cpumask_test_cpu().
>
> If we were to enforce that top level types match, then a user that's
> passing a struct bpf_cpumask to a read-only cpumask_t argument would
> have to cast with something like bpf_cast_to_kern_ctx() (which itself
> would need to be updated to expect the alias, and currently it only
> accommodates a single alias per prog type). Additionally, not specifying
> KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> struct bpf_cpumask *, and another as a struct cpumask *
> (i.e. cpumask_t).
>
> In order to enable this, this patch relaxes the constraint that a
> KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> try and be conservative and match existing behavior / expectations, this
> patch also enforces strict type checking for acquire kfuncs. We were
> already enforcing it for release kfuncs, so this should also improve the
> consistency of the semantics for kfuncs.
>
What you want is to simply follow type at off = 0 (but still enforce the off = 0
requirement). This is something which is currently done for bpf_sk_release (for
struct sk_common) in check_reg_type, but it is not safe in general to just open
this up for all cases. I suggest encoding this particular requirement in the
argument, and simply using triple underscore variant of the type for the special
'read_only' requirement. This will allow you to use same type in your BPF C
program, while allowing verifier to see them as two different types in kfunc
parameters. Then just relax type following for the particular argument so that
one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
it just visits first member after failing match on top level type). off = 0
check is still necessary.
So offset checks still need to be according to OBJ_RELEASE but you can relax
strict_type_match bool for the particular arg when calling btf_struct_ids_match.
All code in your tests will then deal with a cpumask_t type only, including in
kfunc declarations. Same as bpf_nf selftests which don't cast from/to
nf_conn___init and only deal with nf_conn pointers even though semantics differ
depending on how it is used and passed around. Overall more convenient and
simple to use.
On Fri, Jan 20, 2023 at 05:28:28AM IST, David Vernet wrote:
> KF_TRUSTED_ARGS kfuncs currently have a subtle and insidious bug in
> validating pointers to scalars. Say that you have a kfunc like the
> following, which takes an array as the first argument:
>
> bool bpf_cpumask_empty(const struct cpumask *cpumask)
> {
> return cpumask_empty(cpumask);
> }
>
> ...
> BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
> ...
>
This is known and expected.
> If a BPF program were to invoke the kfunc with a NULL argument, it would
> crash the kernel. The reason is that struct cpumask is defined as a
> bitmap, which is itself defined as an array, and is accessed as a memory
> address memory by bitmap operations. So when the verifier analyzes the
> register, it interprets it as a pointer to a scalar struct, which is an
> array of size 8. check_mem_reg() then sees that the register is NULL,
> and returns 0, and the kfunc crashes when it passes it down to the
> cpumask wrappers.
>
> To fix this, this patch adds a check for KF_ARG_PTR_TO_MEM which
> verifies that the register doesn't contain a NULL pointer if the kfunc
> is KF_TRUSTED_ARGS.
>
> This may or may not be desired behavior. Some kfuncs may want to
> allow callers to pass NULL-able pointers. An alternative would be adding
> a KF_NOT_NULL flag and leaving KF_TRUSTED_ARGS alone, though given that
> a kfunc is saying it wants to "trust" an argument, it seems reasonable
> to prevent NULL.
>
> Signed-off-by: David Vernet <[email protected]>
> ---
> kernel/bpf/verifier.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 9fa101420046..28ccb92ebe65 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -9092,6 +9092,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
> i, btf_type_str(ref_t), ref_tname, PTR_ERR(resolve_ret));
> return -EINVAL;
> }
> + if (is_kfunc_trusted_args(meta) && register_is_null(reg)) {
> + verbose(env, "NULL pointer passed to trusted arg%d\n", i);
> + return -EACCES;
> + }
> +
Current patch looks like a stop gap solution. Just checking for register_is_null
is not enough, what about PTR_MAYBE_NULL? That can also be passed. Some
arguments can be both PTR_TO_BTF_ID and PTR_TO_MEM, so it will be bypassed in
the other case because this check is limited to KF_ARG_PTR_TO_MEM. It would
probably be better to disallow NULL by default and explicitly tag the argument
with __or_null to indicate that NULL is accepted. Seems like a much better
default to me.
On Fri, Jan 20, 2023 at 10:51:01AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, Jan 20, 2023 at 05:28:28AM IST, David Vernet wrote:
> > KF_TRUSTED_ARGS kfuncs currently have a subtle and insidious bug in
> > validating pointers to scalars. Say that you have a kfunc like the
> > following, which takes an array as the first argument:
> >
> > bool bpf_cpumask_empty(const struct cpumask *cpumask)
> > {
> > return cpumask_empty(cpumask);
> > }
> >
> > ...
> > BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
> > ...
> >
>
> This is known and expected.
Expected? So kfuncs are expected to always check whether any pointer to
a scalar is non-NULL? Seems like a poor UX. I like your suggestion below
to address it so it's opt-in.
> > If a BPF program were to invoke the kfunc with a NULL argument, it would
> > crash the kernel. The reason is that struct cpumask is defined as a
> > bitmap, which is itself defined as an array, and is accessed as a memory
> > address memory by bitmap operations. So when the verifier analyzes the
> > register, it interprets it as a pointer to a scalar struct, which is an
> > array of size 8. check_mem_reg() then sees that the register is NULL,
> > and returns 0, and the kfunc crashes when it passes it down to the
> > cpumask wrappers.
> >
> > To fix this, this patch adds a check for KF_ARG_PTR_TO_MEM which
> > verifies that the register doesn't contain a NULL pointer if the kfunc
> > is KF_TRUSTED_ARGS.
> >
> > This may or may not be desired behavior. Some kfuncs may want to
> > allow callers to pass NULL-able pointers. An alternative would be adding
> > a KF_NOT_NULL flag and leaving KF_TRUSTED_ARGS alone, though given that
> > a kfunc is saying it wants to "trust" an argument, it seems reasonable
> > to prevent NULL.
> >
> > Signed-off-by: David Vernet <[email protected]>
> > ---
> > kernel/bpf/verifier.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 9fa101420046..28ccb92ebe65 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -9092,6 +9092,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
> > i, btf_type_str(ref_t), ref_tname, PTR_ERR(resolve_ret));
> > return -EINVAL;
> > }
> > + if (is_kfunc_trusted_args(meta) && register_is_null(reg)) {
> > + verbose(env, "NULL pointer passed to trusted arg%d\n", i);
> > + return -EACCES;
> > + }
> > +
>
> Current patch looks like a stop gap solution. Just checking for register_is_null
> is not enough, what about PTR_MAYBE_NULL? That can also be passed. Some
> arguments can be both PTR_TO_BTF_ID and PTR_TO_MEM, so it will be bypassed in
> the other case because this check is limited to KF_ARG_PTR_TO_MEM. It would
This wouldn't happen if you had a PTR_TO_BTF_ID, would it? In that case
you could just rely on PTR_TRUSTED. IMO that really should be the
default for any pointer argument. If you have KF_ARGS_TRUSTED, the kfunc
should just be able to assume that the pointers have been verified.
Regardless, you're right that this isn't a complete solution because of
PTR_MAYBE_NULL. I'm fine with adding an __or_null suffix that allows
NULL, and we disallow NULL or PTR_MAYBE_NULL from any KF_TRUSTED_ARGS
argument otherwise. Or we just also disallow PTR_MAYBE_NULL and try to
hold off on adding yet another suffix until we have proper per-arg kfunc
definitions.
> probably be better to disallow NULL by default and explicitly tag the argument
> with __or_null to indicate that NULL is accepted. Seems like a much better
> default to me.
On Thu, Jan 19, 2023 at 05:58:29PM -0600, David Vernet wrote:
> silently check for and ignore these cases at runtime. When we have e.g.
> per-argument kfunc flags, it might be helpful to add another KF_CPU-type
> flag that specifies that the verifier should validate that it's a valid
> CPU.
...
> +void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
> +{
> + if (!cpu_valid(cpu))
> + return;
> +
> + cpumask_set_cpu(cpu, (struct cpumask *)cpumask);
> +}
...
> +void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
> +{
> + if (!cpu_valid(cpu))
> + return;
I don't think we'll be able to get rid of this with KF_CPU or special suffix.
The argument might be a variable and not a constant at the verification time.
We would have to allow passing unknown vars otherwise the UX will be too restrictive,
so this run-time check would have to stay.
On Thu, Jan 19, 2023 at 05:58:30PM -0600, David Vernet wrote:
> Now that defining trusted fields in a struct is supported, we should add
> selftests to verify the behavior. This patch adds a few such testcases.
>
> Signed-off-by: David Vernet <[email protected]>
> ---
> tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> .../selftests/bpf/prog_tests/nested_trust.c | 64 +++++++++++++++++++
> .../selftests/bpf/progs/nested_trust_common.h | 12 ++++
> .../bpf/progs/nested_trust_failure.c | 33 ++++++++++
> .../bpf/progs/nested_trust_success.c | 29 +++++++++
> 5 files changed, 139 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/nested_trust.c
> create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_common.h
> create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_failure.c
> create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_success.c
>
> diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> index 96e8371f5c2a..1cf5b94cda30 100644
> --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> @@ -44,6 +44,7 @@ map_kptr # failed to open_and_load program: -524
> modify_return # modify_return attach failed: -524 (trampoline)
> module_attach # skel_attach skeleton attach failed: -524 (trampoline)
> mptcp
> +nested_trust # JIT does not support calling kernel function
> netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?)
> probe_user # check_kprobe_res wrong kprobe res from probe read (?)
> rcu_read_lock # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?)
> diff --git a/tools/testing/selftests/bpf/prog_tests/nested_trust.c b/tools/testing/selftests/bpf/prog_tests/nested_trust.c
> new file mode 100644
> index 000000000000..4d13612f5001
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/nested_trust.c
> @@ -0,0 +1,64 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
> +
> +#include <test_progs.h>
> +#include "nested_trust_failure.skel.h"
> +#include "nested_trust_success.skel.h"
> +
> +static const char * const nested_trust_success_testcases[] = {
> + "test_read_cpumask",
> +};
> +
> +static void verify_success(const char *prog_name)
> +{
> + struct nested_trust_success *skel;
> + struct bpf_program *prog;
> + struct bpf_link *link = NULL;
> + int status;
> + pid_t child_pid;
> +
> + skel = nested_trust_success__open();
> + if (!ASSERT_OK_PTR(skel, "nested_trust_success__open"))
> + return;
> +
> + skel->bss->pid = getpid();
> +
> + nested_trust_success__load(skel);
> + if (!ASSERT_OK_PTR(skel, "nested_trust_success__load"))
> + goto cleanup;
> +
> + prog = bpf_object__find_program_by_name(skel->obj, prog_name);
> + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
> + goto cleanup;
> +
> + link = bpf_program__attach(prog);
> + if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
> + goto cleanup;
> +
> + child_pid = fork();
> + if (!ASSERT_GT(child_pid, -1, "child_pid"))
> + goto cleanup;
> + if (child_pid == 0)
> + _exit(0);
> + waitpid(child_pid, &status, 0);
> + ASSERT_OK(skel->bss->err, "post_wait_err");
> +
> + bpf_link__destroy(link);
> +
> +cleanup:
> + nested_trust_success__destroy(skel);
> +}
> +
> +void test_nested_trust(void)
> +{
> + int i;
> +
> + for (i = 0; i < ARRAY_SIZE(nested_trust_success_testcases); i++) {
> + if (!test__start_subtest(nested_trust_success_testcases[i]))
> + continue;
> +
> + verify_success(nested_trust_success_testcases[i]);
> + }
> +
> + RUN_TESTS(nested_trust_failure);
> +}
Hmm. I thought RUN_TESTS() works for successes too.
Looking at test_loader.c:run_subtest() that should be the case.
Could you please double check?
verify_success() above shouldn't be needed.
On Thu, Jan 19, 2023 at 09:48:23PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 19, 2023 at 05:58:29PM -0600, David Vernet wrote:
> > silently check for and ignore these cases at runtime. When we have e.g.
> > per-argument kfunc flags, it might be helpful to add another KF_CPU-type
> > flag that specifies that the verifier should validate that it's a valid
> > CPU.
>
> ...
>
> > +void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
> > +{
> > + if (!cpu_valid(cpu))
> > + return;
> > +
> > + cpumask_set_cpu(cpu, (struct cpumask *)cpumask);
> > +}
>
> ...
>
> > +void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
> > +{
> > + if (!cpu_valid(cpu))
> > + return;
>
> I don't think we'll be able to get rid of this with KF_CPU or special suffix.
> The argument might be a variable and not a constant at the verification time.
> We would have to allow passing unknown vars otherwise the UX will be too restrictive,
> so this run-time check would have to stay.
Makes sense. We'll just leave it as is then and document that passing in
cpu >= nr_cpus is silently ignored for any kfunc taking a cpu argument.
On Thu, Jan 19, 2023 at 11:50:47PM -0600, David Vernet wrote:
> On Thu, Jan 19, 2023 at 09:48:23PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 19, 2023 at 05:58:29PM -0600, David Vernet wrote:
> > > silently check for and ignore these cases at runtime. When we have e.g.
> > > per-argument kfunc flags, it might be helpful to add another KF_CPU-type
> > > flag that specifies that the verifier should validate that it's a valid
> > > CPU.
> >
> > ...
> >
> > > +void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
> > > +{
> > > + if (!cpu_valid(cpu))
> > > + return;
> > > +
> > > + cpumask_set_cpu(cpu, (struct cpumask *)cpumask);
> > > +}
> >
> > ...
> >
> > > +void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
> > > +{
> > > + if (!cpu_valid(cpu))
> > > + return;
> >
> > I don't think we'll be able to get rid of this with KF_CPU or special suffix.
> > The argument might be a variable and not a constant at the verification time.
> > We would have to allow passing unknown vars otherwise the UX will be too restrictive,
> > so this run-time check would have to stay.
>
> Makes sense. We'll just leave it as is then and document that passing in
> cpu >= nr_cpus is silently ignored for any kfunc taking a cpu argument.
Eventually we can clean it up with bpf_assert infra.
On Thu, Jan 19, 2023 at 11:31:51PM -0600, David Vernet wrote:
> On Fri, Jan 20, 2023 at 10:51:01AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, Jan 20, 2023 at 05:28:28AM IST, David Vernet wrote:
> > > KF_TRUSTED_ARGS kfuncs currently have a subtle and insidious bug in
> > > validating pointers to scalars. Say that you have a kfunc like the
> > > following, which takes an array as the first argument:
> > >
> > > bool bpf_cpumask_empty(const struct cpumask *cpumask)
> > > {
> > > return cpumask_empty(cpumask);
> > > }
> > >
> > > ...
> > > BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
> > > ...
> > >
> >
> > This is known and expected.
>
> Expected? So kfuncs are expected to always check whether any pointer to
> a scalar is non-NULL? Seems like a poor UX. I like your suggestion below
> to address it so it's opt-in.
I'm confused as well.
KF_TRUSTED_ARGS means that all arguments are valid and != NULL.
From our doc:
"
The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
indicates that the all pointer arguments are valid, and that all pointers to
BTF objects have been passed in their unmodified form (that is, at a zero
"
That includes that arguments are guaranted to be != NULL.
> > > If a BPF program were to invoke the kfunc with a NULL argument, it would
> > > crash the kernel. The reason is that struct cpumask is defined as a
> > > bitmap, which is itself defined as an array, and is accessed as a memory
> > > address memory by bitmap operations. So when the verifier analyzes the
> > > register, it interprets it as a pointer to a scalar struct, which is an
> > > array of size 8. check_mem_reg() then sees that the register is NULL,
> > > and returns 0, and the kfunc crashes when it passes it down to the
> > > cpumask wrappers.
> > >
> > > To fix this, this patch adds a check for KF_ARG_PTR_TO_MEM which
> > > verifies that the register doesn't contain a NULL pointer if the kfunc
> > > is KF_TRUSTED_ARGS.
> > >
> > > This may or may not be desired behavior. Some kfuncs may want to
> > > allow callers to pass NULL-able pointers. An alternative would be adding
> > > a KF_NOT_NULL flag and leaving KF_TRUSTED_ARGS alone, though given that
> > > a kfunc is saying it wants to "trust" an argument, it seems reasonable
> > > to prevent NULL.
> > >
> > > Signed-off-by: David Vernet <[email protected]>
> > > ---
> > > kernel/bpf/verifier.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 9fa101420046..28ccb92ebe65 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -9092,6 +9092,11 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
> > > i, btf_type_str(ref_t), ref_tname, PTR_ERR(resolve_ret));
> > > return -EINVAL;
> > > }
> > > + if (is_kfunc_trusted_args(meta) && register_is_null(reg)) {
> > > + verbose(env, "NULL pointer passed to trusted arg%d\n", i);
> > > + return -EACCES;
> > > + }
> > > +
> >
> > Current patch looks like a stop gap solution. Just checking for register_is_null
> > is not enough, what about PTR_MAYBE_NULL? That can also be passed. Some
> > arguments can be both PTR_TO_BTF_ID and PTR_TO_MEM, so it will be bypassed in
> > the other case because this check is limited to KF_ARG_PTR_TO_MEM. It would
>
> This wouldn't happen if you had a PTR_TO_BTF_ID, would it? In that case
> you could just rely on PTR_TRUSTED. IMO that really should be the
> default for any pointer argument. If you have KF_ARGS_TRUSTED, the kfunc
> should just be able to assume that the pointers have been verified.
+1
> Regardless, you're right that this isn't a complete solution because of
> PTR_MAYBE_NULL. I'm fine with adding an __or_null suffix that allows
> NULL, and we disallow NULL or PTR_MAYBE_NULL from any KF_TRUSTED_ARGS
> argument otherwise. Or we just also disallow PTR_MAYBE_NULL and try to
> hold off on adding yet another suffix until we have proper per-arg kfunc
> definitions.
PTR_MAYBE_NULL should not be allowed into kfunc with KF_TRUSTED_ARGS.
On Fri, Jan 20, 2023 at 11:26:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
> <[email protected]> wrote:
> >
> > On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > > currently enforces that the top-level type must match when calling
> > > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > > For example, if you have the following type:
> > > > >
> > > > > struct nf_conn___init {
> > > > > struct nf_conn ct;
> > > > > };
> > > > >
> > > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > > struct nf_conn.
> > > >
> > > > Just running bpf_nf selftest would have shown this is false.
> > >
> > > And I feel silly, because I did run them, and could have sworn they
> > > passed...looking now at the change_status_after_alloc testcase I see
> > > you're of course correct. Very poor example, thank you for pointing it
> > > out.
> > >
> > > >
> > > > > Being able to do this will be useful for certain types
> > > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > > series of kfuncs will be added which allow programs to do bitwise
> > > > > queries on cpumasks that are either allocated by the program (in which
> > > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > > its first element), or a cpumask that was allocated by the main kernel
> > > > > (in which case it will just be a straight cpumask_t, as in
> > > > > task->cpus_ptr).
> > > > >
> > > > > Having the two types of cpumasks allows us to distinguish between the
> > > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > > bpf_cpumask_test_cpu().
> > > > >
> > > > > If we were to enforce that top level types match, then a user that's
> > > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > > would need to be updated to expect the alias, and currently it only
> > > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > > (i.e. cpumask_t).
> > > > >
> > > > > In order to enable this, this patch relaxes the constraint that a
> > > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > > try and be conservative and match existing behavior / expectations, this
> > > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > > already enforcing it for release kfuncs, so this should also improve the
> > > > > consistency of the semantics for kfuncs.
> > > > >
> > > >
> > > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > > requirement). This is something which is currently done for bpf_sk_release (for
> > > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > > this up for all cases. I suggest encoding this particular requirement in the
> > > > argument, and simply using triple underscore variant of the type for the special
> > > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > > program, while allowing verifier to see them as two different types in kfunc
> > > > parameters. Then just relax type following for the particular argument so that
> > > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > > it just visits first member after failing match on top level type). off = 0
> > > > check is still necessary.
> > >
> > > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > > can do for now, and in general seems pretty useful. Obviously preferable
> > > to this patch which just doesn't work. Alexei, are you OK with this? If
> > > so, I'll take this approach for v2.
> >
> > We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> > but with that we twisted the C standard to, what looks to be, a wrong direction.
> >
> > For definition:
> > struct nf_conn___init {
> > struct nf_conn ct;
> > };
> > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> > for both read and write, because in C that's valid and safe type cast.
> >
>
> The intention of this nf_conn___init was to be invisible to the user.
> In selftests there is no trace of nf_conn___init. It is only for
> enforcing semantics by virtue of type safety in the verifier.
>
> Allocated but not inserted nf_conn -> nf_conn___init
> Inserted/looked up nf_conn -> nf_conn
>
> We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
> The allocated nf_conn may not yet be fully initialized. It is only
> after bpf_ct_insert_entry takes the nf_conn___init * and returns
> inserted nf_conn * should it be allowed.
Yes. I know and agree with all of the above.
> But for the user in BPF C it will be the same nf_conn. The verifier
> can enforce different semantics on the underlying type's usage in
> kfuncs etc, while the user performs normal direct access to the
> nf_conn.
>
> It will be the same case here, except you also introduce the case of
> kfuncs that are 'polymorphic' and can take both. Relaxing
> 'strict_type_match' for that arg and placing the type of member you
> wish to convert the pointer to gives you such polymorphism. But it's
> not correct to do for nf_conn___init to nf_conn, at least not by
> default.
Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
I meant to say:
For definition:
struct nf_conn_init {
struct nf_conn ct;
};
if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
for both read and write, because in C that's valid and safe type cast.
Meainng that C rules apply.
Our triple underscore is special, because it's the "same type".
In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
I think you're arguing that just "___" part is enough to enforce strict match.
Matching foo___flavor with foo should not be allowed.
While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
is safe.
If so, I'm fine with such approach.
On Thu, Jan 19, 2023 at 09:51:49PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 19, 2023 at 05:58:30PM -0600, David Vernet wrote:
> > Now that defining trusted fields in a struct is supported, we should add
> > selftests to verify the behavior. This patch adds a few such testcases.
> >
> > Signed-off-by: David Vernet <[email protected]>
> > ---
> > tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
> > .../selftests/bpf/prog_tests/nested_trust.c | 64 +++++++++++++++++++
> > .../selftests/bpf/progs/nested_trust_common.h | 12 ++++
> > .../bpf/progs/nested_trust_failure.c | 33 ++++++++++
> > .../bpf/progs/nested_trust_success.c | 29 +++++++++
> > 5 files changed, 139 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/nested_trust.c
> > create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_common.h
> > create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_failure.c
> > create mode 100644 tools/testing/selftests/bpf/progs/nested_trust_success.c
> >
> > diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
> > index 96e8371f5c2a..1cf5b94cda30 100644
> > --- a/tools/testing/selftests/bpf/DENYLIST.s390x
> > +++ b/tools/testing/selftests/bpf/DENYLIST.s390x
> > @@ -44,6 +44,7 @@ map_kptr # failed to open_and_load program: -524
> > modify_return # modify_return attach failed: -524 (trampoline)
> > module_attach # skel_attach skeleton attach failed: -524 (trampoline)
> > mptcp
> > +nested_trust # JIT does not support calling kernel function
> > netcnt # failed to load BPF skeleton 'netcnt_prog': -7 (?)
> > probe_user # check_kprobe_res wrong kprobe res from probe read (?)
> > rcu_read_lock # failed to find kernel BTF type ID of '__x64_sys_getpgid': -3 (?)
> > diff --git a/tools/testing/selftests/bpf/prog_tests/nested_trust.c b/tools/testing/selftests/bpf/prog_tests/nested_trust.c
> > new file mode 100644
> > index 000000000000..4d13612f5001
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/nested_trust.c
> > @@ -0,0 +1,64 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */
> > +
> > +#include <test_progs.h>
> > +#include "nested_trust_failure.skel.h"
> > +#include "nested_trust_success.skel.h"
> > +
> > +static const char * const nested_trust_success_testcases[] = {
> > + "test_read_cpumask",
> > +};
> > +
> > +static void verify_success(const char *prog_name)
> > +{
> > + struct nested_trust_success *skel;
> > + struct bpf_program *prog;
> > + struct bpf_link *link = NULL;
> > + int status;
> > + pid_t child_pid;
> > +
> > + skel = nested_trust_success__open();
> > + if (!ASSERT_OK_PTR(skel, "nested_trust_success__open"))
> > + return;
> > +
> > + skel->bss->pid = getpid();
> > +
> > + nested_trust_success__load(skel);
> > + if (!ASSERT_OK_PTR(skel, "nested_trust_success__load"))
> > + goto cleanup;
> > +
> > + prog = bpf_object__find_program_by_name(skel->obj, prog_name);
> > + if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
> > + goto cleanup;
> > +
> > + link = bpf_program__attach(prog);
> > + if (!ASSERT_OK_PTR(link, "bpf_program__attach"))
> > + goto cleanup;
> > +
> > + child_pid = fork();
> > + if (!ASSERT_GT(child_pid, -1, "child_pid"))
> > + goto cleanup;
> > + if (child_pid == 0)
> > + _exit(0);
> > + waitpid(child_pid, &status, 0);
> > + ASSERT_OK(skel->bss->err, "post_wait_err");
> > +
> > + bpf_link__destroy(link);
> > +
> > +cleanup:
> > + nested_trust_success__destroy(skel);
> > +}
> > +
> > +void test_nested_trust(void)
> > +{
> > + int i;
> > +
> > + for (i = 0; i < ARRAY_SIZE(nested_trust_success_testcases); i++) {
> > + if (!test__start_subtest(nested_trust_success_testcases[i]))
> > + continue;
> > +
> > + verify_success(nested_trust_success_testcases[i]);
> > + }
> > +
> > + RUN_TESTS(nested_trust_failure);
> > +}
>
> Hmm. I thought RUN_TESTS() works for successes too.
> Looking at test_loader.c:run_subtest() that should be the case.
> Could you please double check?
> verify_success() above shouldn't be needed.
Yes, looks like RUN_TESTS() works for success cases too, it just isn't
used anywhere yet. I expect it won't be super commonly used given that
it only loads the program and tests the verifier rather than doing any
runtime validation, but that's exactly what we want for this so I'll
update that in v2. I'll plan on leaving the cpumask success cases as
they are unless you object so that we can get runtime coverage as well.
On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
<[email protected]> wrote:
>
> On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > currently enforces that the top-level type must match when calling
> > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > For example, if you have the following type:
> > > >
> > > > struct nf_conn___init {
> > > > struct nf_conn ct;
> > > > };
> > > >
> > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > struct nf_conn.
> > >
> > > Just running bpf_nf selftest would have shown this is false.
> >
> > And I feel silly, because I did run them, and could have sworn they
> > passed...looking now at the change_status_after_alloc testcase I see
> > you're of course correct. Very poor example, thank you for pointing it
> > out.
> >
> > >
> > > > Being able to do this will be useful for certain types
> > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > series of kfuncs will be added which allow programs to do bitwise
> > > > queries on cpumasks that are either allocated by the program (in which
> > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > its first element), or a cpumask that was allocated by the main kernel
> > > > (in which case it will just be a straight cpumask_t, as in
> > > > task->cpus_ptr).
> > > >
> > > > Having the two types of cpumasks allows us to distinguish between the
> > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > bpf_cpumask_test_cpu().
> > > >
> > > > If we were to enforce that top level types match, then a user that's
> > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > would need to be updated to expect the alias, and currently it only
> > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > (i.e. cpumask_t).
> > > >
> > > > In order to enable this, this patch relaxes the constraint that a
> > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > try and be conservative and match existing behavior / expectations, this
> > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > already enforcing it for release kfuncs, so this should also improve the
> > > > consistency of the semantics for kfuncs.
> > > >
> > >
> > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > requirement). This is something which is currently done for bpf_sk_release (for
> > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > this up for all cases. I suggest encoding this particular requirement in the
> > > argument, and simply using triple underscore variant of the type for the special
> > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > program, while allowing verifier to see them as two different types in kfunc
> > > parameters. Then just relax type following for the particular argument so that
> > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > it just visits first member after failing match on top level type). off = 0
> > > check is still necessary.
> >
> > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > can do for now, and in general seems pretty useful. Obviously preferable
> > to this patch which just doesn't work. Alexei, are you OK with this? If
> > so, I'll take this approach for v2.
>
> We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> but with that we twisted the C standard to, what looks to be, a wrong direction.
>
> For definition:
> struct nf_conn___init {
> struct nf_conn ct;
> };
> if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> for both read and write, because in C that's valid and safe type cast.
>
The intention of this nf_conn___init was to be invisible to the user.
In selftests there is no trace of nf_conn___init. It is only for
enforcing semantics by virtue of type safety in the verifier.
Allocated but not inserted nf_conn -> nf_conn___init
Inserted/looked up nf_conn -> nf_conn
We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
The allocated nf_conn may not yet be fully initialized. It is only
after bpf_ct_insert_entry takes the nf_conn___init * and returns
inserted nf_conn * should it be allowed.
But for the user in BPF C it will be the same nf_conn. The verifier
can enforce different semantics on the underlying type's usage in
kfuncs etc, while the user performs normal direct access to the
nf_conn.
It will be the same case here, except you also introduce the case of
kfuncs that are 'polymorphic' and can take both. Relaxing
'strict_type_match' for that arg and placing the type of member you
wish to convert the pointer to gives you such polymorphism. But it's
not correct to do for nf_conn___init to nf_conn, at least not by
default.
In the future we may do:
union bpf_subtype {
type A;
type B;
type C;
};
And using the relaxed rule allows all types at off = 0 to be passed to
kfuncs expecting type A/B/C for bpf_subtype *.
bpf_subtype is a fake type. We're just using the type system to
enforce different API usage for the same underlying kernel type.
> We can fix this design issue by saying that '___init' suffix is special and
> C type casting rules don't apply to it.
> In all other cases bpf_cpumask/cpumask would should allow it.
>
I'm just saying the triple underscore is not visible to the user.
You can declare kfunc that is:
struct foo___x *foo_alloc(void); in the kernel as
struct foo *foo_alloc(void); in BPF program and avoid all the
casting/ugliness and still enforce semantics around use.
On Thu, Jan 19, 2023 at 09:59:06PM -0800, Alexei Starovoitov wrote:
> On Thu, Jan 19, 2023 at 05:58:32PM -0600, David Vernet wrote:
> > +
> > +For example:
> > +
> > +.. code-block:: c
> > +
> > + /**
> > + * A trivial example tracepoint program that shows how to
> > + * acquire and release a struct bpf_cpumask *.
> > + */
> > + SEC("tp_btf/task_newtask")
> > + int BPF_PROG(task_acquire_release_example, struct task_struct *task, u64 clone_flags)
> > + {
> > + struct bpf_cpumask *cpumask, *acquired;
> > +
> > + cpumask = bpf_cpumask_create();
> > + if (!cpumask)
> > + return 1;
> > +
> > + acquired = bpf_cpumask_acquire(cpumask);
> > + bpf_cpumask_release(cpumask);
> > + bpf_cpumask_acquire(acquired);
> > +
> > + return 0;
> > + }
>
> As the first example in the doc it was... alarming :)
> I've read it as it says that bpf_cpumask_acquire has to be called on
> freshly created cpumask before it can be used.
That's fair, I treated this example like a testcase :-)
> I've started to doubt by code reading skills of the previous patches :)
> A basic example is probably necessary to introduce the concept.
> Or this example should have bpf_cpumask_set_cpu right after create and
> more alu ops after release with comments to demonstrate the point.
Makes sense. I'll update the examples to be more clear and reflect
realistic use cases.
Hi David,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20230119235833.2948341-5-void%40manifault.com
patch subject: [PATCH bpf-next 4/8] bpf: Enable cpumasks to be queried and used as kptrs
config: i386-randconfig-a015 (https://download.01.org/0day-ci/archive/20230120/[email protected]/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/11541205c58f2226e5ffbc5967317469d65efac6
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
git checkout 11541205c58f2226e5ffbc5967317469d65efac6
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All warnings (new ones prefixed by >>):
>> kernel/bpf/cpumask.c:38:21: warning: no previous prototype for function 'bpf_cpumask_create' [-Wmissing-prototypes]
struct bpf_cpumask *bpf_cpumask_create(void)
^
kernel/bpf/cpumask.c:38:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct bpf_cpumask *bpf_cpumask_create(void)
^
static
>> kernel/bpf/cpumask.c:52:21: warning: no previous prototype for function 'bpf_cpumask_acquire' [-Wmissing-prototypes]
struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:52:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:58:21: warning: no previous prototype for function 'bpf_cpumask_kptr_get' [-Wmissing-prototypes]
struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
^
kernel/bpf/cpumask.c:58:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
^
static
>> kernel/bpf/cpumask.c:76:6: warning: no previous prototype for function 'bpf_cpumask_release' [-Wmissing-prototypes]
void bpf_cpumask_release(struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:76:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_release(struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:88:5: warning: no previous prototype for function 'bpf_cpumask_first' [-Wmissing-prototypes]
u32 bpf_cpumask_first(const struct cpumask *cpumask)
^
kernel/bpf/cpumask.c:88:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u32 bpf_cpumask_first(const struct cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:93:5: warning: no previous prototype for function 'bpf_cpumask_first_zero' [-Wmissing-prototypes]
u32 bpf_cpumask_first_zero(const struct cpumask *cpumask)
^
kernel/bpf/cpumask.c:93:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u32 bpf_cpumask_first_zero(const struct cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:98:6: warning: no previous prototype for function 'bpf_cpumask_set_cpu' [-Wmissing-prototypes]
void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:98:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:106:6: warning: no previous prototype for function 'bpf_cpumask_clear_cpu' [-Wmissing-prototypes]
void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:106:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:114:6: warning: no previous prototype for function 'bpf_cpumask_test_cpu' [-Wmissing-prototypes]
bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
^
kernel/bpf/cpumask.c:114:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:122:6: warning: no previous prototype for function 'bpf_cpumask_test_and_set_cpu' [-Wmissing-prototypes]
bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:122:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:130:6: warning: no previous prototype for function 'bpf_cpumask_test_and_clear_cpu' [-Wmissing-prototypes]
bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:130:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:138:6: warning: no previous prototype for function 'bpf_cpumask_setall' [-Wmissing-prototypes]
void bpf_cpumask_setall(struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:138:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_setall(struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:143:6: warning: no previous prototype for function 'bpf_cpumask_clear' [-Wmissing-prototypes]
void bpf_cpumask_clear(struct bpf_cpumask *cpumask)
^
kernel/bpf/cpumask.c:143:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_clear(struct bpf_cpumask *cpumask)
^
static
>> kernel/bpf/cpumask.c:148:6: warning: no previous prototype for function 'bpf_cpumask_and' [-Wmissing-prototypes]
bool bpf_cpumask_and(struct bpf_cpumask *dst,
^
kernel/bpf/cpumask.c:148:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_and(struct bpf_cpumask *dst,
^
static
>> kernel/bpf/cpumask.c:155:6: warning: no previous prototype for function 'bpf_cpumask_or' [-Wmissing-prototypes]
void bpf_cpumask_or(struct bpf_cpumask *dst,
^
kernel/bpf/cpumask.c:155:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_or(struct bpf_cpumask *dst,
^
static
>> kernel/bpf/cpumask.c:162:6: warning: no previous prototype for function 'bpf_cpumask_xor' [-Wmissing-prototypes]
void bpf_cpumask_xor(struct bpf_cpumask *dst,
^
kernel/bpf/cpumask.c:162:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_xor(struct bpf_cpumask *dst,
^
static
>> kernel/bpf/cpumask.c:169:6: warning: no previous prototype for function 'bpf_cpumask_equal' [-Wmissing-prototypes]
bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2)
^
kernel/bpf/cpumask.c:169:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2)
^
static
>> kernel/bpf/cpumask.c:174:6: warning: no previous prototype for function 'bpf_cpumask_intersects' [-Wmissing-prototypes]
bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2)
^
kernel/bpf/cpumask.c:174:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2)
^
static
>> kernel/bpf/cpumask.c:179:6: warning: no previous prototype for function 'bpf_cpumask_subset' [-Wmissing-prototypes]
bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2)
^
kernel/bpf/cpumask.c:179:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2)
^
static
>> kernel/bpf/cpumask.c:184:6: warning: no previous prototype for function 'bpf_cpumask_empty' [-Wmissing-prototypes]
bool bpf_cpumask_empty(const struct cpumask *cpumask)
^
kernel/bpf/cpumask.c:184:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_empty(const struct cpumask *cpumask)
^
static
kernel/bpf/cpumask.c:189:6: warning: no previous prototype for function 'bpf_cpumask_full' [-Wmissing-prototypes]
bool bpf_cpumask_full(const struct cpumask *cpumask)
^
kernel/bpf/cpumask.c:189:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
bool bpf_cpumask_full(const struct cpumask *cpumask)
^
static
kernel/bpf/cpumask.c:194:6: warning: no previous prototype for function 'bpf_cpumask_copy' [-Wmissing-prototypes]
void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src)
^
kernel/bpf/cpumask.c:194:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src)
^
static
kernel/bpf/cpumask.c:199:5: warning: no previous prototype for function 'bpf_cpumask_any' [-Wmissing-prototypes]
u32 bpf_cpumask_any(const struct cpumask *cpumask)
^
kernel/bpf/cpumask.c:199:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u32 bpf_cpumask_any(const struct cpumask *cpumask)
^
static
kernel/bpf/cpumask.c:204:5: warning: no previous prototype for function 'bpf_cpumask_any_and' [-Wmissing-prototypes]
u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2)
^
kernel/bpf/cpumask.c:204:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2)
^
static
24 warnings generated.
vim +/bpf_cpumask_create +38 kernel/bpf/cpumask.c
37
> 38 struct bpf_cpumask *bpf_cpumask_create(void)
39 {
40 struct bpf_cpumask *cpumask;
41
42 cpumask = bpf_mem_alloc(&bpf_cpumask_ma, sizeof(*cpumask));
43 if (!cpumask)
44 return NULL;
45
46 memset(cpumask, 0, sizeof(*cpumask));
47 refcount_set(&cpumask->usage, 1);
48
49 return cpumask;
50 }
51
> 52 struct bpf_cpumask *bpf_cpumask_acquire(struct bpf_cpumask *cpumask)
53 {
54 refcount_inc(&cpumask->usage);
55 return cpumask;
56 }
57
> 58 struct bpf_cpumask *bpf_cpumask_kptr_get(struct bpf_cpumask **cpumaskp)
59 {
60 struct bpf_cpumask *cpumask;
61
62 /* The BPF memory allocator frees memory backing its caches in an RCU
63 * callback. Thus, we can safely use RCU to ensure that the cpumask is
64 * safe to read.
65 */
66 rcu_read_lock();
67
68 cpumask = READ_ONCE(*cpumaskp);
69 if (cpumask && !refcount_inc_not_zero(&cpumask->usage))
70 cpumask = NULL;
71
72 rcu_read_unlock();
73 return cpumask;
74 }
75
> 76 void bpf_cpumask_release(struct bpf_cpumask *cpumask)
77 {
78 if (!cpumask)
79 return;
80
81 if (refcount_dec_and_test(&cpumask->usage)) {
82 migrate_disable();
83 bpf_mem_free(&bpf_cpumask_ma, cpumask);
84 migrate_enable();
85 }
86 }
87
> 88 u32 bpf_cpumask_first(const struct cpumask *cpumask)
89 {
90 return cpumask_first(cpumask);
91 }
92
> 93 u32 bpf_cpumask_first_zero(const struct cpumask *cpumask)
94 {
95 return cpumask_first_zero(cpumask);
96 }
97
> 98 void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
99 {
100 if (!cpu_valid(cpu))
101 return;
102
103 cpumask_set_cpu(cpu, (struct cpumask *)cpumask);
104 }
105
> 106 void bpf_cpumask_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
107 {
108 if (!cpu_valid(cpu))
109 return;
110
111 cpumask_clear_cpu(cpu, (struct cpumask *)cpumask);
112 }
113
> 114 bool bpf_cpumask_test_cpu(u32 cpu, const struct cpumask *cpumask)
115 {
116 if (!cpu_valid(cpu))
117 return false;
118
119 return cpumask_test_cpu(cpu, (struct cpumask *)cpumask);
120 }
121
> 122 bool bpf_cpumask_test_and_set_cpu(u32 cpu, struct bpf_cpumask *cpumask)
123 {
124 if (!cpu_valid(cpu))
125 return false;
126
127 return cpumask_test_and_set_cpu(cpu, (struct cpumask *)cpumask);
128 }
129
> 130 bool bpf_cpumask_test_and_clear_cpu(u32 cpu, struct bpf_cpumask *cpumask)
131 {
132 if (!cpu_valid(cpu))
133 return false;
134
135 return cpumask_test_and_clear_cpu(cpu, (struct cpumask *)cpumask);
136 }
137
> 138 void bpf_cpumask_setall(struct bpf_cpumask *cpumask)
139 {
140 cpumask_setall((struct cpumask *)cpumask);
141 }
142
> 143 void bpf_cpumask_clear(struct bpf_cpumask *cpumask)
144 {
145 cpumask_clear((struct cpumask *)cpumask);
146 }
147
> 148 bool bpf_cpumask_and(struct bpf_cpumask *dst,
149 const struct cpumask *src1,
150 const struct cpumask *src2)
151 {
152 return cpumask_and((struct cpumask *)dst, src1, src2);
153 }
154
> 155 void bpf_cpumask_or(struct bpf_cpumask *dst,
156 const struct cpumask *src1,
157 const struct cpumask *src2)
158 {
159 cpumask_or((struct cpumask *)dst, src1, src2);
160 }
161
> 162 void bpf_cpumask_xor(struct bpf_cpumask *dst,
163 const struct cpumask *src1,
164 const struct cpumask *src2)
165 {
166 cpumask_xor((struct cpumask *)dst, src1, src2);
167 }
168
> 169 bool bpf_cpumask_equal(const struct cpumask *src1, const struct cpumask *src2)
170 {
171 return cpumask_equal(src1, src2);
172 }
173
> 174 bool bpf_cpumask_intersects(const struct cpumask *src1, const struct cpumask *src2)
175 {
176 return cpumask_intersects(src1, src2);
177 }
178
> 179 bool bpf_cpumask_subset(const struct cpumask *src1, const struct cpumask *src2)
180 {
181 return cpumask_subset(src1, src2);
182 }
183
> 184 bool bpf_cpumask_empty(const struct cpumask *cpumask)
185 {
186 return cpumask_empty(cpumask);
187 }
188
> 189 bool bpf_cpumask_full(const struct cpumask *cpumask)
190 {
191 return cpumask_full(cpumask);
192 }
193
> 194 void bpf_cpumask_copy(struct bpf_cpumask *dst, const struct cpumask *src)
195 {
196 cpumask_copy((struct cpumask *)dst, src);
197 }
198
> 199 u32 bpf_cpumask_any(const struct cpumask *cpumask)
200 {
201 return cpumask_any(cpumask);
202 }
203
> 204 u32 bpf_cpumask_any_and(const struct cpumask *src1, const struct cpumask *src2)
205 {
206 return cpumask_any_and(src1, src2);
207 }
208
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
Hi David,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20230119235833.2948341-2-void%40manifault.com
patch subject: [PATCH bpf-next 1/8] bpf: Enable annotating trusted nested pointers
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20230120/[email protected]/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/8f6df14342b1be3516f8e21037edf771df851427
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review David-Vernet/bpf-Enable-annotating-trusted-nested-pointers/20230120-080139
git checkout 8f6df14342b1be3516f8e21037edf771df851427
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash kernel/bpf/
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
All warnings (new ones prefixed by >>):
>> kernel/bpf/btf.c:533:5: warning: no previous prototype for function 'bpf_find_btf_id' [-Wmissing-prototypes]
s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
^
kernel/bpf/btf.c:533:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
^
static
1 warning generated.
vim +/bpf_find_btf_id +533 kernel/bpf/btf.c
532
> 533 s32 bpf_find_btf_id(const char *name, u32 kind, struct btf **btf_p)
534 {
535 struct btf *btf;
536 s32 ret;
537 int id;
538
539 btf = bpf_get_btf_vmlinux();
540 if (IS_ERR(btf))
541 return PTR_ERR(btf);
542 if (!btf)
543 return -EINVAL;
544
545 ret = btf_find_by_name_kind(btf, name, kind);
546 /* ret is never zero, since btf_find_by_name_kind returns
547 * positive btf_id or negative error.
548 */
549 if (ret > 0) {
550 btf_get(btf);
551 *btf_p = btf;
552 return ret;
553 }
554
555 /* If name is not found in vmlinux's BTF then search in module's BTFs */
556 spin_lock_bh(&btf_idr_lock);
557 idr_for_each_entry(&btf_idr, btf, id) {
558 if (!btf_is_module(btf))
559 continue;
560 /* linear search could be slow hence unlock/lock
561 * the IDR to avoiding holding it for too long
562 */
563 btf_get(btf);
564 spin_unlock_bh(&btf_idr_lock);
565 ret = btf_find_by_name_kind(btf, name, kind);
566 if (ret > 0) {
567 *btf_p = btf;
568 return ret;
569 }
570 spin_lock_bh(&btf_idr_lock);
571 btf_put(btf);
572 }
573 spin_unlock_bh(&btf_idr_lock);
574 return ret;
575 }
576
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
On Thu, Jan 19, 2023 at 05:58:32PM -0600, David Vernet wrote:
> +
> +For example:
> +
> +.. code-block:: c
> +
> + /**
> + * A trivial example tracepoint program that shows how to
> + * acquire and release a struct bpf_cpumask *.
> + */
> + SEC("tp_btf/task_newtask")
> + int BPF_PROG(task_acquire_release_example, struct task_struct *task, u64 clone_flags)
> + {
> + struct bpf_cpumask *cpumask, *acquired;
> +
> + cpumask = bpf_cpumask_create();
> + if (!cpumask)
> + return 1;
> +
> + acquired = bpf_cpumask_acquire(cpumask);
> + bpf_cpumask_release(cpumask);
> + bpf_cpumask_acquire(acquired);
> +
> + return 0;
> + }
As the first example in the doc it was... alarming :)
I've read it as it says that bpf_cpumask_acquire has to be called on
freshly created cpumask before it can be used.
I've started to doubt by code reading skills of the previous patches :)
A basic example is probably necessary to introduce the concept.
Or this example should have bpf_cpumask_set_cpu right after create and
more alu ops after release with comments to demonstrate the point.
On Thu, Jan 19, 2023 at 10:14:41PM -0800, Alexei Starovoitov wrote:
> On Fri, Jan 20, 2023 at 11:26:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
> > <[email protected]> wrote:
> > >
> > > On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > > > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > > > currently enforces that the top-level type must match when calling
> > > > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > > > For example, if you have the following type:
> > > > > >
> > > > > > struct nf_conn___init {
> > > > > > struct nf_conn ct;
> > > > > > };
> > > > > >
> > > > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > > > struct nf_conn.
> > > > >
> > > > > Just running bpf_nf selftest would have shown this is false.
> > > >
> > > > And I feel silly, because I did run them, and could have sworn they
> > > > passed...looking now at the change_status_after_alloc testcase I see
> > > > you're of course correct. Very poor example, thank you for pointing it
> > > > out.
> > > >
> > > > >
> > > > > > Being able to do this will be useful for certain types
> > > > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > > > series of kfuncs will be added which allow programs to do bitwise
> > > > > > queries on cpumasks that are either allocated by the program (in which
> > > > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > > > its first element), or a cpumask that was allocated by the main kernel
> > > > > > (in which case it will just be a straight cpumask_t, as in
> > > > > > task->cpus_ptr).
> > > > > >
> > > > > > Having the two types of cpumasks allows us to distinguish between the
> > > > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > > > bpf_cpumask_test_cpu().
> > > > > >
> > > > > > If we were to enforce that top level types match, then a user that's
> > > > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > > > would need to be updated to expect the alias, and currently it only
> > > > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > > > (i.e. cpumask_t).
> > > > > >
> > > > > > In order to enable this, this patch relaxes the constraint that a
> > > > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > > > try and be conservative and match existing behavior / expectations, this
> > > > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > > > already enforcing it for release kfuncs, so this should also improve the
> > > > > > consistency of the semantics for kfuncs.
> > > > > >
> > > > >
> > > > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > > > requirement). This is something which is currently done for bpf_sk_release (for
> > > > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > > > this up for all cases. I suggest encoding this particular requirement in the
> > > > > argument, and simply using triple underscore variant of the type for the special
> > > > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > > > program, while allowing verifier to see them as two different types in kfunc
> > > > > parameters. Then just relax type following for the particular argument so that
> > > > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > > > it just visits first member after failing match on top level type). off = 0
> > > > > check is still necessary.
> > > >
> > > > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > > > can do for now, and in general seems pretty useful. Obviously preferable
> > > > to this patch which just doesn't work. Alexei, are you OK with this? If
> > > > so, I'll take this approach for v2.
> > >
> > > We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> > > but with that we twisted the C standard to, what looks to be, a wrong direction.
> > >
> > > For definition:
> > > struct nf_conn___init {
> > > struct nf_conn ct;
> > > };
> > > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> > > for both read and write, because in C that's valid and safe type cast.
> > >
> >
> > The intention of this nf_conn___init was to be invisible to the user.
> > In selftests there is no trace of nf_conn___init. It is only for
> > enforcing semantics by virtue of type safety in the verifier.
> >
> > Allocated but not inserted nf_conn -> nf_conn___init
> > Inserted/looked up nf_conn -> nf_conn
> >
> > We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
> > The allocated nf_conn may not yet be fully initialized. It is only
> > after bpf_ct_insert_entry takes the nf_conn___init * and returns
> > inserted nf_conn * should it be allowed.
>
> Yes. I know and agree with all of the above.
>
> > But for the user in BPF C it will be the same nf_conn. The verifier
> > can enforce different semantics on the underlying type's usage in
> > kfuncs etc, while the user performs normal direct access to the
> > nf_conn.
> >
> > It will be the same case here, except you also introduce the case of
> > kfuncs that are 'polymorphic' and can take both. Relaxing
> > 'strict_type_match' for that arg and placing the type of member you
> > wish to convert the pointer to gives you such polymorphism. But it's
> > not correct to do for nf_conn___init to nf_conn, at least not by
> > default.
>
> Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
> I meant to say:
>
> For definition:
> struct nf_conn_init {
> struct nf_conn ct;
> };
> if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
> for both read and write, because in C that's valid and safe type cast.
>
> Meainng that C rules apply.
> Our triple underscore is special, because it's the "same type".
> In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
> I think you're arguing that just "___" part is enough to enforce strict match.
> Matching foo___flavor with foo should not be allowed.
> While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
> is safe.
> If so, I'm fine with such approach.
Alright, I'll spin v2 to treat any type with name___.* as a disallowed
alias, and update the documentation to mention it. I was originally
going to push back and say that we should just use a single alias like
__nocast to keep things simple, but it doesn't feel generalizable
enough.
On Fri, Jan 20, 2023 at 08:56:55AM -0600, David Vernet wrote:
> On Thu, Jan 19, 2023 at 10:14:41PM -0800, Alexei Starovoitov wrote:
> > On Fri, Jan 20, 2023 at 11:26:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > On Fri, 20 Jan 2023 at 11:10, Alexei Starovoitov
> > > <[email protected]> wrote:
> > > >
> > > > On Thu, Jan 19, 2023 at 11:23:18PM -0600, David Vernet wrote:
> > > > > On Fri, Jan 20, 2023 at 10:28:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > > On Fri, Jan 20, 2023 at 05:28:27AM IST, David Vernet wrote:
> > > > > > > When validating BTF types for KF_TRUSTED_ARGS kfuncs, the verifier
> > > > > > > currently enforces that the top-level type must match when calling
> > > > > > > the kfunc. In other words, the verifier does not allow the BPF program
> > > > > > > to pass a bitwise equivalent struct, despite it being functionally safe.
> > > > > > > For example, if you have the following type:
> > > > > > >
> > > > > > > struct nf_conn___init {
> > > > > > > struct nf_conn ct;
> > > > > > > };
> > > > > > >
> > > > > > > It would be safe to pass a struct nf_conn___init to a kfunc expecting a
> > > > > > > struct nf_conn.
> > > > > >
> > > > > > Just running bpf_nf selftest would have shown this is false.
> > > > >
> > > > > And I feel silly, because I did run them, and could have sworn they
> > > > > passed...looking now at the change_status_after_alloc testcase I see
> > > > > you're of course correct. Very poor example, thank you for pointing it
> > > > > out.
> > > > >
> > > > > >
> > > > > > > Being able to do this will be useful for certain types
> > > > > > > of kfunc / kptrs enabled by BPF. For example, in a follow-on patch, a
> > > > > > > series of kfuncs will be added which allow programs to do bitwise
> > > > > > > queries on cpumasks that are either allocated by the program (in which
> > > > > > > case they'll be a 'struct bpf_cpumask' type that wraps a cpumask_t as
> > > > > > > its first element), or a cpumask that was allocated by the main kernel
> > > > > > > (in which case it will just be a straight cpumask_t, as in
> > > > > > > task->cpus_ptr).
> > > > > > >
> > > > > > > Having the two types of cpumasks allows us to distinguish between the
> > > > > > > two for when a cpumask is read-only vs. mutatable. A struct bpf_cpumask
> > > > > > > can be mutated by e.g. bpf_cpumask_clear(), whereas a regular cpumask_t
> > > > > > > cannot be. On the other hand, a struct bpf_cpumask can of course be
> > > > > > > queried in the exact same manner as a cpumask_t, with e.g.
> > > > > > > bpf_cpumask_test_cpu().
> > > > > > >
> > > > > > > If we were to enforce that top level types match, then a user that's
> > > > > > > passing a struct bpf_cpumask to a read-only cpumask_t argument would
> > > > > > > have to cast with something like bpf_cast_to_kern_ctx() (which itself
> > > > > > > would need to be updated to expect the alias, and currently it only
> > > > > > > accommodates a single alias per prog type). Additionally, not specifying
> > > > > > > KF_TRUSTED_ARGS is not an option, as some kfuncs take one argument as a
> > > > > > > struct bpf_cpumask *, and another as a struct cpumask *
> > > > > > > (i.e. cpumask_t).
> > > > > > >
> > > > > > > In order to enable this, this patch relaxes the constraint that a
> > > > > > > KF_TRUSTED_ARGS kfunc must have strict type matching. In order to
> > > > > > > try and be conservative and match existing behavior / expectations, this
> > > > > > > patch also enforces strict type checking for acquire kfuncs. We were
> > > > > > > already enforcing it for release kfuncs, so this should also improve the
> > > > > > > consistency of the semantics for kfuncs.
> > > > > > >
> > > > > >
> > > > > > What you want is to simply follow type at off = 0 (but still enforce the off = 0
> > > > > > requirement). This is something which is currently done for bpf_sk_release (for
> > > > > > struct sk_common) in check_reg_type, but it is not safe in general to just open
> > > > > > this up for all cases. I suggest encoding this particular requirement in the
> > > > > > argument, and simply using triple underscore variant of the type for the special
> > > > > > 'read_only' requirement. This will allow you to use same type in your BPF C
> > > > > > program, while allowing verifier to see them as two different types in kfunc
> > > > > > parameters. Then just relax type following for the particular argument so that
> > > > > > one can pass cpumask_t___ro to kfunc expecting cpumask_t (but only at off = 0,
> > > > > > it just visits first member after failing match on top level type). off = 0
> > > > > > check is still necessary.
> > > > >
> > > > > Sigh, yeah, another ___ workaround but I agree it's probably the best we
> > > > > can do for now, and in general seems pretty useful. Obviously preferable
> > > > > to this patch which just doesn't work. Alexei, are you OK with this? If
> > > > > so, I'll take this approach for v2.
> > > >
> > > > We decided to rely on strict type match when we introduced 'struct nf_conn___init',
> > > > but with that we twisted the C standard to, what looks to be, a wrong direction.
> > > >
> > > > For definition:
> > > > struct nf_conn___init {
> > > > struct nf_conn ct;
> > > > };
> > > > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn__init
> > > > for both read and write, because in C that's valid and safe type cast.
> > > >
> > >
> > > The intention of this nf_conn___init was to be invisible to the user.
> > > In selftests there is no trace of nf_conn___init. It is only for
> > > enforcing semantics by virtue of type safety in the verifier.
> > >
> > > Allocated but not inserted nf_conn -> nf_conn___init
> > > Inserted/looked up nf_conn -> nf_conn
> > >
> > > We can't pass e.g. nf_conn___init * to a function expecting nf_conn *.
> > > The allocated nf_conn may not yet be fully initialized. It is only
> > > after bpf_ct_insert_entry takes the nf_conn___init * and returns
> > > inserted nf_conn * should it be allowed.
> >
> > Yes. I know and agree with all of the above.
> >
> > > But for the user in BPF C it will be the same nf_conn. The verifier
> > > can enforce different semantics on the underlying type's usage in
> > > kfuncs etc, while the user performs normal direct access to the
> > > nf_conn.
> > >
> > > It will be the same case here, except you also introduce the case of
> > > kfuncs that are 'polymorphic' and can take both. Relaxing
> > > 'strict_type_match' for that arg and placing the type of member you
> > > wish to convert the pointer to gives you such polymorphism. But it's
> > > not correct to do for nf_conn___init to nf_conn, at least not by
> > > default.
> >
> > Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
> > I meant to say:
> >
> > For definition:
> > struct nf_conn_init {
> > struct nf_conn ct;
> > };
> > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
> > for both read and write, because in C that's valid and safe type cast.
> >
> > Meainng that C rules apply.
> > Our triple underscore is special, because it's the "same type".
> > In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
> > I think you're arguing that just "___" part is enough to enforce strict match.
> > Matching foo___flavor with foo should not be allowed.
> > While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
> > is safe.
> > If so, I'm fine with such approach.
>
> Alright, I'll spin v2 to treat any type with name___.* as a disallowed
> alias, and update the documentation to mention it. I was originally
> going to push back and say that we should just use a single alias like
> __nocast to keep things simple, but it doesn't feel generalizable
> enough.
On second thought, unless you guys feel strongly, I'll just check
___init. The resulting code is going to be a lot of tricky string
manipulation / math otherwise. Not _terrible_, but I'd prefer to avoid
adding it until we have a concrete use-case. And I expect this could be
implemented much simpler using something like tags, once gcc has support
for it.
On Fri, Jan 20, 2023 at 7:26 AM David Vernet <[email protected]> wrote:
> > >
> > > Yes. Agree. I used unfortunate example in the previous reply with nf_conn___init.
> > > I meant to say:
> > >
> > > For definition:
> > > struct nf_conn_init {
> > > struct nf_conn ct;
> > > };
> > > if a kfunc accepts a pointer to nf_conn it should always accept a pointer to nf_conn_init
> > > for both read and write, because in C that's valid and safe type cast.
> > >
> > > Meainng that C rules apply.
> > > Our triple underscore is special, because it's the "same type".
> > > In the 2nd part of my reply I'm proposing to use the whole suffix "___init" to indicate that.
> > > I think you're arguing that just "___" part is enough to enforce strict match.
> > > Matching foo___flavor with foo should not be allowed.
> > > While passing struct foo_flavor {struct foo;} into a kfunc that accepts 'struct foo'
> > > is safe.
> > > If so, I'm fine with such approach.
> >
> > Alright, I'll spin v2 to treat any type with name___.* as a disallowed
> > alias, and update the documentation to mention it. I was originally
> > going to push back and say that we should just use a single alias like
> > __nocast to keep things simple, but it doesn't feel generalizable
> > enough.
>
> On second thought, unless you guys feel strongly, I'll just check
> ___init. The resulting code is going to be a lot of tricky string
> manipulation / math otherwise. Not _terrible_, but I'd prefer to avoid
> adding it until we have a concrete use-case. And I expect this could be
> implemented much simpler using something like tags, once gcc has support
> for it.
There is bpf_core_is_flavor_sep() that makes it easy to check,
but thinking more about it we probably should stick to strict "___init"
suffix for now, since flavors can appear in progs too and they
are equivalent to corresponding types in the kernel.
The nf_conn___init is kernel only type.
The verifier sees that bpf_xdp_ct_alloc kfunc returns it,
but when we export this kfunc to bpf prog it returns nf_conn.
We probably should pick some other suffix without "___" to distinguish
from flavors. bpf prog should not use nf_conn___init.