2024-01-03 18:54:17

by Barret Rhoden

Subject: [PATCH v2 bpf-next 0/2] inline asm helpers to access array elements

Sorry for the delay on this. Discussed in [1]. It's a helper for
proving to the verifier that your access into the array is within
bounds. Happy to change names or whatever. =)
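
For reference, usage looks roughly like this (a sketch; arr, arr_sz,
idx, and val are placeholders):

	int *x = bpf_array_elem(arr, arr_sz, idx);

	if (x)
		*x = val;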

Also added a libbpf helper function for mmapping an mmappable map.

We've been using both in our ghost-BPF schedulers[2].

[1] https://lore.kernel.org/bpf/[email protected]/T/
[2] https://github.com/google/ghost-userspace/blob/main/third_party/bpf/common.bpf.h#L218

v1: https://lore.kernel.org/bpf/[email protected]/
- use libbpf's internal bpf_map_mmap_sz()
- remove debugging cruft

Barret Rhoden (2):
libbpf: add helpers for mmapping maps
selftests/bpf: add inline assembly helpers to access array elements

tools/bpf/bpftool/gen.c | 16 +-
tools/lib/bpf/libbpf.c | 18 ++
tools/lib/bpf/libbpf.h | 6 +
tools/lib/bpf/libbpf.map | 4 +
.../bpf/prog_tests/test_array_elem.c | 112 ++++++++++
.../selftests/bpf/progs/array_elem_test.c | 195 ++++++++++++++++++
tools/testing/selftests/bpf/progs/bpf_misc.h | 43 ++++
7 files changed, 381 insertions(+), 13 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_array_elem.c
create mode 100644 tools/testing/selftests/bpf/progs/array_elem_test.c

--
2.43.0.472.g3155946c3a-goog



2024-01-03 18:56:04

by Barret Rhoden

Subject: [PATCH v2 bpf-next 1/2] libbpf: add helpers for mmapping maps

bpf_map_mmap_sz() was internal to bpftool and libbpf. Remove bpftool's
copy, expose it as bpf_map__mmap_size(), and use that to build
bpf_map__mmap() and bpf_map__munmap() wrappers.
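
Expected usage is something like this (a sketch; error handling is
elided and the skeleton/map names are placeholders). Note that
bpf_map__mmap() returns mmap()'s result directly, so failure is
MAP_FAILED rather than NULL:

	int *elems = bpf_map__mmap(skel->maps.arraymap);

	if (elems != MAP_FAILED) {
		/* read and write elements directly, no syscall per access */
		elems[0] = 1;
		bpf_map__munmap(skel->maps.arraymap, elems);
	}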

Signed-off-by: Barret Rhoden <[email protected]>
---
tools/bpf/bpftool/gen.c | 16 +++-------------
tools/lib/bpf/libbpf.c | 18 ++++++++++++++++++
tools/lib/bpf/libbpf.h | 6 ++++++
tools/lib/bpf/libbpf.map | 4 ++++
4 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index ee3ce2b8000d..a328e960c141 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -453,16 +453,6 @@ static void print_hex(const char *data, int data_sz)
}
}

-static size_t bpf_map_mmap_sz(const struct bpf_map *map)
-{
- long page_sz = sysconf(_SC_PAGE_SIZE);
- size_t map_sz;
-
- map_sz = (size_t)roundup(bpf_map__value_size(map), 8) * bpf_map__max_entries(map);
- map_sz = roundup(map_sz, page_sz);
- return map_sz;
-}
-
/* Emit type size asserts for all top-level fields in memory-mapped internal maps. */
static void codegen_asserts(struct bpf_object *obj, const char *obj_name)
{
@@ -641,7 +631,7 @@ static void codegen_destroy(struct bpf_object *obj, const char *obj_name)
if (bpf_map__is_internal(map) &&
(bpf_map__map_flags(map) & BPF_F_MMAPABLE))
printf("\tskel_free_map_data(skel->%1$s, skel->maps.%1$s.initial_value, %2$zd);\n",
- ident, bpf_map_mmap_sz(map));
+ ident, bpf_map__mmap_size(map));
codegen("\
\n\
skel_closenz(skel->maps.%1$s.map_fd); \n\
@@ -723,7 +713,7 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
goto cleanup; \n\
skel->maps.%1$s.initial_value = (__u64) (long) skel->%1$s;\n\
} \n\
- ", ident, bpf_map_mmap_sz(map));
+ ", ident, bpf_map__mmap_size(map));
}
codegen("\
\n\
@@ -780,7 +770,7 @@ static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *h
if (!skel->%1$s) \n\
return -ENOMEM; \n\
",
- ident, bpf_map_mmap_sz(map), mmap_flags);
+ ident, bpf_map__mmap_size(map), mmap_flags);
}
codegen("\
\n\
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index ebcfb2147fbd..2dae6fabc0d1 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9830,6 +9830,24 @@ void *bpf_map__initial_value(struct bpf_map *map, size_t *psize)
return map->mmaped;
}

+size_t bpf_map__mmap_size(const struct bpf_map *map)
+{
+ return bpf_map_mmap_sz(bpf_map__value_size(map),
+ bpf_map__max_entries(map));
+}
+
+void *bpf_map__mmap(const struct bpf_map *map)
+{
+ return mmap(NULL, bpf_map__mmap_size(map),
+ PROT_READ | PROT_WRITE, MAP_SHARED,
+ bpf_map__fd(map), 0);
+}
+
+int bpf_map__munmap(const struct bpf_map *map, void *addr)
+{
+ return munmap(addr, bpf_map__mmap_size(map));
+}
+
bool bpf_map__is_internal(const struct bpf_map *map)
{
return map->libbpf_type != LIBBPF_MAP_UNSPEC;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 6cd9c501624f..148f4c783ca7 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -996,6 +996,12 @@ LIBBPF_API int bpf_map__set_map_extra(struct bpf_map *map, __u64 map_extra);
LIBBPF_API int bpf_map__set_initial_value(struct bpf_map *map,
const void *data, size_t size);
LIBBPF_API void *bpf_map__initial_value(struct bpf_map *map, size_t *psize);
+/* get the mmappable size of the map */
+LIBBPF_API size_t bpf_map__mmap_size(const struct bpf_map *map);
+/* mmap the map */
+LIBBPF_API void *bpf_map__mmap(const struct bpf_map *map);
+/* munmap the map at addr */
+LIBBPF_API int bpf_map__munmap(const struct bpf_map *map, void *addr);

/**
* @brief **bpf_map__is_internal()** tells the caller whether or not the
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 91c5aef7dae7..9e44de4fbf39 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -411,4 +411,8 @@ LIBBPF_1.3.0 {
} LIBBPF_1.2.0;

LIBBPF_1.4.0 {
+ global:
+ bpf_map__mmap;
+ bpf_map__mmap_size;
+ bpf_map__munmap;
} LIBBPF_1.3.0;
--
2.43.0.472.g3155946c3a-goog


2024-01-03 18:56:13

by Barret Rhoden

Subject: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

When accessing an array, even if you insert your own bounds check,
sometimes the compiler will remove the check, or modify it such that the
verifier no longer knows your access is within bounds.

The compiler is even free to make a copy of a register, check the copy,
and use the original to access the array. The verifier knows the *copy*
is within bounds, but not the original register!

Add bpf_array_elem() and bpf_array_sz_elem() helpers that do the bounds
check and the pointer arithmetic on the same register in inline
assembly, so the compiler cannot separate the check from the access.
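
For illustration, the problematic codegen looks roughly like this (a
hypothetical shape; the register numbers are illustrative):

	r2 = r1                     /* copy of the index */
	if r2 >= 0x64 goto out      /* bounds check on the copy */
	r1 <<= 3                    /* scale the original index... */
	r0 += r1                    /* ...and use it, unchecked */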

Signed-off-by: Barret Rhoden <[email protected]>
---
.../bpf/prog_tests/test_array_elem.c | 112 ++++++++++
.../selftests/bpf/progs/array_elem_test.c | 195 ++++++++++++++++++
tools/testing/selftests/bpf/progs/bpf_misc.h | 43 ++++
3 files changed, 350 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_array_elem.c
create mode 100644 tools/testing/selftests/bpf/progs/array_elem_test.c

diff --git a/tools/testing/selftests/bpf/prog_tests/test_array_elem.c b/tools/testing/selftests/bpf/prog_tests/test_array_elem.c
new file mode 100644
index 000000000000..c953636f07c9
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/test_array_elem.c
@@ -0,0 +1,112 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+#include <test_progs.h>
+#include "array_elem_test.skel.h"
+
+#define NR_MAP_ELEMS 100
+
+/*
+ * Helper to load and run a program.
+ * Caller must define skel, map_elems, and bss_elems.
+ * Destroy the skel when you're done.
+ */
+#define load_and_run(PROG) ({ \
+ int err; \
+ skel = array_elem_test__open(); \
+ if (!ASSERT_OK_PTR(skel, "array_elem_test open")) \
+ return; \
+ bpf_program__set_autoload(skel->progs.x_ ## PROG, true); \
+ err = array_elem_test__load(skel); \
+ if (!ASSERT_EQ(err, 0, "array_elem_test load")) { \
+ array_elem_test__destroy(skel); \
+ return; \
+ } \
+ err = array_elem_test__attach(skel); \
+ if (!ASSERT_EQ(err, 0, "array_elem_test attach")) { \
+ array_elem_test__destroy(skel); \
+ return; \
+ } \
+ for (int i = 0; i < NR_MAP_ELEMS; i++) \
+ skel->bss->lookup_indexes[i] = i; \
+ map_elems = bpf_map__mmap(skel->maps.arraymap); \
+ ASSERT_OK_PTR(map_elems, "mmap"); \
+ bss_elems = skel->bss->bss_elems; \
+ skel->bss->target_pid = getpid(); \
+ usleep(1); \
+})
+
+static void test_access_all(void)
+{
+ struct array_elem_test *skel;
+ int *map_elems;
+ int *bss_elems;
+
+ load_and_run(access_all);
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++)
+ ASSERT_EQ(map_elems[i], i, "array_elem map value not written");
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++)
+ ASSERT_EQ(bss_elems[i], i, "array_elem bss value not written");
+
+ array_elem_test__destroy(skel);
+}
+
+static void test_oob_access(void)
+{
+ struct array_elem_test *skel;
+ int *map_elems;
+ int *bss_elems;
+
+ load_and_run(oob_access);
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++)
+ ASSERT_EQ(map_elems[i], 0, "array_elem map value was written");
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++)
+ ASSERT_EQ(bss_elems[i], 0, "array_elem bss value was written");
+
+ array_elem_test__destroy(skel);
+}
+
+static void test_access_array_map_infer_sz(void)
+{
+ struct array_elem_test *skel;
+ int *map_elems;
+ int *bss_elems __maybe_unused;
+
+ load_and_run(access_array_map_infer_sz);
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++)
+ ASSERT_EQ(map_elems[i], i, "array_elem map value not written");
+
+ array_elem_test__destroy(skel);
+}
+
+
+/* Test that attempting to load a bad program fails. */
+#define test_bad(PROG) ({ \
+ struct array_elem_test *skel; \
+ int err; \
+ skel = array_elem_test__open(); \
+ if (!ASSERT_OK_PTR(skel, "array_elem_test open")) \
+ return; \
+ bpf_program__set_autoload(skel->progs.x_bad_ ## PROG, true); \
+ err = array_elem_test__load(skel); \
+ ASSERT_ERR(err, "array_elem_test load " # PROG); \
+ array_elem_test__destroy(skel); \
+})
+
+void test_test_array_elem(void)
+{
+ if (test__start_subtest("array_elem_access_all"))
+ test_access_all();
+ if (test__start_subtest("array_elem_oob_access"))
+ test_oob_access();
+ if (test__start_subtest("array_elem_access_array_map_infer_sz"))
+ test_access_array_map_infer_sz();
+ if (test__start_subtest("array_elem_bad_map_array_access"))
+ test_bad(map_array_access);
+ if (test__start_subtest("array_elem_bad_bss_array_access"))
+ test_bad(bss_array_access);
+}
diff --git a/tools/testing/selftests/bpf/progs/array_elem_test.c b/tools/testing/selftests/bpf/progs/array_elem_test.c
new file mode 100644
index 000000000000..9d48afc933f0
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/array_elem_test.c
@@ -0,0 +1,195 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Google LLC. */
+#include <stdbool.h>
+#include <linux/types.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+int target_pid = 0;
+
+#define NR_MAP_ELEMS 100
+
+/*
+ * We want to test valid accesses into an array, but we also need to fool the
+ * verifier. If we just do for (i = 0; i < 100; i++), the verifier knows the
+ * value of i and can tell we're inside the array.
+ *
+ * This "lookup" array is just the values 0, 1, 2..., such that
+ * lookup_indexes[i] == i. (set by userspace). But the verifier doesn't know
+ * that.
+ */
+unsigned int lookup_indexes[NR_MAP_ELEMS];
+
+/* Arrays can be in the BSS or inside a map element. Make sure both work. */
+int bss_elems[NR_MAP_ELEMS];
+
+struct map_array {
+ int elems[NR_MAP_ELEMS];
+};
+
+/*
+ * This is an ARRAY_MAP of a single struct, and that struct is an array of
+ * elements. Userspace can mmap the map as if it was just a basic array of
+ * elements. Though if you make an ARRAY_MAP where the *values* are ints, don't
+ * forget that bpf map elements are rounded up to 8 bytes.
+ *
+ * Once you get the pointer to the base of the inner array, you can access all
+ * of the elements without another bpf_map_lookup_elem(), which is useful if you
+ * are operating on multiple elements while holding a spinlock.
+ */
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 1);
+ __type(key, int);
+ __type(value, struct map_array);
+ __uint(map_flags, BPF_F_MMAPABLE);
+} arraymap SEC(".maps");
+
+static struct map_array *get_map_array(void)
+{
+ int zero = 0;
+
+ return bpf_map_lookup_elem(&arraymap, &zero);
+}
+
+static int *get_map_elems(void)
+{
+ struct map_array *arr = get_map_array();
+
+ if (!arr)
+ return NULL;
+ return arr->elems;
+}
+
+/*
+ * Test that we can access all elements, and that we are accessing the element
+ * we think we are accessing.
+ */
+static void access_all(void)
+{
+ int *map_elems = get_map_elems();
+ int *x;
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++) {
+ x = bpf_array_elem(map_elems, NR_MAP_ELEMS, lookup_indexes[i]);
+ if (x)
+ *x = i;
+ }
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++) {
+ x = bpf_array_sz_elem(bss_elems, lookup_indexes[i]);
+ if (x)
+ *x = i;
+ }
+}
+
+SEC("?tp/syscalls/sys_enter_nanosleep")
+int x_access_all(void *ctx)
+{
+ if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
+ return 0;
+ access_all();
+ return 0;
+}
+
+/*
+ * Helper for various OOB tests. An out-of-bound access should be handled like
+ * a lookup failure. Specifically, the verifier should ensure we do not access
+ * outside the array. Userspace will check that we didn't access somewhere
+ * inside the array.
+ */
+static void set_elem_to_1(long idx)
+{
+ int *map_elems = get_map_elems();
+ int *x;
+
+ x = bpf_array_elem(map_elems, NR_MAP_ELEMS, idx);
+ if (x)
+ *x = 1;
+ x = bpf_array_sz_elem(bss_elems, idx);
+ if (x)
+ *x = 1;
+}
+
+/*
+ * Test various out-of-bounds accesses.
+ */
+static void oob_access(void)
+{
+ set_elem_to_1(NR_MAP_ELEMS + 5);
+ set_elem_to_1(NR_MAP_ELEMS);
+ set_elem_to_1(-1);
+ set_elem_to_1(~0UL);
+}
+
+SEC("?tp/syscalls/sys_enter_nanosleep")
+int x_oob_access(void *ctx)
+{
+ if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
+ return 0;
+ oob_access();
+ return 0;
+}
+
+/*
+ * Test that we can use the ARRAY_SIZE-style helper with an array in a map.
+ *
+ * Note that you cannot infer the size of the array from just a pointer; you
+ * have to use the actual elems[100]. i.e. this will fail and should fail to
+ * compile (-Wsizeof-pointer-div):
+ *
+ * int *map_elems = get_map_elems();
+ * x = bpf_array_sz_elem(map_elems, lookup_indexes[i]);
+ */
+static void access_array_map_infer_sz(void)
+{
+ struct map_array *arr = get_map_array();
+ int *x;
+
+ for (int i = 0; i < NR_MAP_ELEMS; i++) {
+ x = bpf_array_sz_elem(arr->elems, lookup_indexes[i]);
+ if (x)
+ *x = i;
+ }
+}
+
+SEC("?tp/syscalls/sys_enter_nanosleep")
+int x_access_array_map_infer_sz(void *ctx)
+{
+ if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
+ return 0;
+ access_array_map_infer_sz();
+ return 0;
+}
+
+
+
+SEC("?tp/syscalls/sys_enter_nanosleep")
+int x_bad_map_array_access(void *ctx)
+{
+ int *map_elems = get_map_elems();
+
+ /*
+ * Need to check to promote map_elems from MAP_OR_NULL to MAP so that we
+ * fail to load below for the right reason.
+ */
+ if (!map_elems)
+ return 0;
+ /* Fail to load: we don't prove our access is inside map_elems[] */
+ for (int i = 0; i < NR_MAP_ELEMS; i++)
+ map_elems[lookup_indexes[i]] = i;
+ return 0;
+}
+
+SEC("?tp/syscalls/sys_enter_nanosleep")
+int x_bad_bss_array_access(void *ctx)
+{
+ /* Fail to load: we don't prove our access is inside bss_elems[] */
+ for (int i = 0; i < NR_MAP_ELEMS; i++)
+ bss_elems[lookup_indexes[i]] = i;
+ return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index 2fd59970c43a..002bab44cde2 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -135,4 +135,47 @@
/* make it look to compiler like value is read and written */
#define __sink(expr) asm volatile("" : "+g"(expr))

+/*
+ * Access an array element within a bound, such that the verifier knows the
+ * access is safe.
+ *
+ * This macro asm is the equivalent of:
+ *
+ * if (!arr)
+ * return NULL;
+ * if (idx >= arr_sz)
+ * return NULL;
+ * return &arr[idx];
+ *
+ * The index (___idx below) needs to be a u64, at least for certain versions of
+ * the BPF ISA, since there aren't u32 conditional jumps.
+ */
+#define bpf_array_elem(arr, arr_sz, idx) ({ \
+ typeof(&(arr)[0]) ___arr = arr; \
+ __u64 ___idx = idx; \
+ if (___arr) { \
+ asm volatile("if %[__idx] >= %[__bound] goto 1f; \
+ %[__idx] *= %[__size]; \
+ %[__arr] += %[__idx]; \
+ goto 2f; \
+ 1:; \
+ %[__arr] = 0; \
+ 2: \
+ " \
+ : [__arr]"+r"(___arr), [__idx]"+r"(___idx) \
+ : [__bound]"r"((arr_sz)), \
+ [__size]"i"(sizeof(typeof((arr)[0]))) \
+ : "cc"); \
+ } \
+ ___arr; \
+})
+
+/*
+ * Convenience wrapper for bpf_array_elem(), where we compute the size of the
+ * array. Be sure to use an actual array, and not a pointer, just like with the
+ * ARRAY_SIZE macro.
+ */
+#define bpf_array_sz_elem(arr, idx) \
+ bpf_array_elem(arr, sizeof(arr) / sizeof((arr)[0]), idx)
+
#endif
--
2.43.0.472.g3155946c3a-goog


2024-01-03 19:42:28

by Andrii Nakryiko

Subject: Re: [PATCH v2 bpf-next 1/2] libbpf: add helpers for mmapping maps

On Wed, Jan 3, 2024 at 10:54 AM Barret Rhoden <[email protected]> wrote:
>
> bpf_map__mmap_size() was internal to bpftool. Use that to make wrappers
> for mmap and munmap.
>
> Signed-off-by: Barret Rhoden <[email protected]>
> ---
> tools/bpf/bpftool/gen.c | 16 +++-------------
> tools/lib/bpf/libbpf.c | 18 ++++++++++++++++++
> tools/lib/bpf/libbpf.h | 6 ++++++
> tools/lib/bpf/libbpf.map | 4 ++++
> 4 files changed, 31 insertions(+), 13 deletions(-)
>

There is very little added value provided by these APIs, while adding
API obligations. bpf_map__mmap() assumes READ/WRITE access, why? What
if the user needs only read-only? And so on.
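
For example, a read-only consumer would still have to drop down to raw
mmap() anyway, something like:

  void *m = mmap(NULL, bpf_map__mmap_size(map), PROT_READ,
                 MAP_SHARED, bpf_map__fd(map), 0);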

Please drop this patch; we don't need to expose these functions as
stable APIs. Definitely not bpf_map__mmap/munmap. bpf_map__mmap_size()
might be useful to hide various per-map-type details, but we'll need
to discuss all this separately.

> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index ee3ce2b8000d..a328e960c141 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c
> @@ -453,16 +453,6 @@ static void print_hex(const char *data, int data_sz)
> }
> }
>
> -static size_t bpf_map_mmap_sz(const struct bpf_map *map)
> -{
> - long page_sz = sysconf(_SC_PAGE_SIZE);
> - size_t map_sz;
> -
> - map_sz = (size_t)roundup(bpf_map__value_size(map), 8) * bpf_map__max_entries(map);

This is specifically the ARRAY map's rule; it might differ for other map
types (e.g., RINGBUF has different logic).

> - map_sz = roundup(map_sz, page_sz);
> - return map_sz;
> -}
> -

[...]

2024-01-03 19:45:37

by Barret Rhoden

Subject: Re: [PATCH v2 bpf-next 1/2] libbpf: add helpers for mmapping maps

On 1/3/24 14:42, Andrii Nakryiko wrote:
> this is specifically ARRAY map's rules, it might differ for other map
> types (e.g., RINGBUF has different logic)

Good to know. Will drop it.

2024-01-03 20:00:48

by Andrii Nakryiko

Subject: Re: [PATCH v2 bpf-next 1/2] libbpf: add helpers for mmapping maps

On Wed, Jan 3, 2024 at 11:45 AM Barret Rhoden <[email protected]> wrote:
>
> On 1/3/24 14:42, Andrii Nakryiko wrote:
> > this is specifically ARRAY map's rules, it might differ for other map
> > types (e.g., RINGBUF has different logic)
>
> good to know. will drop it.


Please give other people a bit more time to review your current
revision; don't rapid-fire new revisions immediately after getting
feedback.

2024-01-04 13:43:22

by Jiri Olsa

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

On Wed, Jan 03, 2024 at 01:53:59PM -0500, Barret Rhoden wrote:

SNIP

> +
> +
> +/* Test that attempting to load a bad program fails. */
> +#define test_bad(PROG) ({ \
> + struct array_elem_test *skel; \
> + int err; \
> + skel = array_elem_test__open(); \
> + if (!ASSERT_OK_PTR(skel, "array_elem_test open")) \
> + return; \
> + bpf_program__set_autoload(skel->progs.x_bad_ ## PROG, true); \
> + err = array_elem_test__load(skel); \
> + ASSERT_ERR(err, "array_elem_test load " # PROG); \
> + array_elem_test__destroy(skel); \
> +})

I wonder if we could use the existing RUN_TESTS macro and use tags
in programs like we do for example in progs/test_global_func1.c:

SEC("tc")
__failure __msg("combined stack size of 4 calls is 544")
int global_func1(struct __sk_buff *skb)
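
Applied to this series, that could look something like (a sketch; the
__msg string would need to match the actual verifier error):

SEC("?tp/syscalls/sys_enter_nanosleep")
__failure __msg("invalid access to map value")
int x_bad_map_array_access(void *ctx)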

jirka


> +
> +void test_test_array_elem(void)
> +{
> + if (test__start_subtest("array_elem_access_all"))
> + test_access_all();
> + if (test__start_subtest("array_elem_oob_access"))
> + test_oob_access();
> + if (test__start_subtest("array_elem_access_array_map_infer_sz"))
> + test_access_array_map_infer_sz();
> + if (test__start_subtest("array_elem_bad_map_array_access"))
> + test_bad(map_array_access);
> + if (test__start_subtest("array_elem_bad_bss_array_access"))
> + test_bad(bss_array_access);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/array_elem_test.c b/tools/testing/selftests/bpf/progs/array_elem_test.c
> new file mode 100644
> index 000000000000..9d48afc933f0
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/array_elem_test.c
> @@ -0,0 +1,195 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2024 Google LLC. */
> +#include <stdbool.h>
> +#include <linux/types.h>
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include "bpf_misc.h"
> +
> +char _license[] SEC("license") = "GPL";
> +
> +int target_pid = 0;
> +
> +#define NR_MAP_ELEMS 100
> +
> +/*
> + * We want to test valid accesses into an array, but we also need to fool the
> + * verifier. If we just do for (i = 0; i < 100; i++), the verifier knows the
> + * value of i and can tell we're inside the array.
> + *
> + * This "lookup" array is just the values 0, 1, 2..., such that
> + * lookup_indexes[i] == i. (set by userspace). But the verifier doesn't know
> + * that.
> + */
> +unsigned int lookup_indexes[NR_MAP_ELEMS];
> +
> +/* Arrays can be in the BSS or inside a map element. Make sure both work. */
> +int bss_elems[NR_MAP_ELEMS];
> +
> +struct map_array {
> + int elems[NR_MAP_ELEMS];
> +};
> +
> +/*
> + * This is an ARRAY_MAP of a single struct, and that struct is an array of
> + * elements. Userspace can mmap the map as if it was just a basic array of
> + * elements. Though if you make an ARRAY_MAP where the *values* are ints, don't
> + * forget that bpf map elements are rounded up to 8 bytes.
> + *
> + * Once you get the pointer to the base of the inner array, you can access all
> + * of the elements without another bpf_map_lookup_elem(), which is useful if you
> + * are operating on multiple elements while holding a spinlock.
> + */
> +struct {
> + __uint(type, BPF_MAP_TYPE_ARRAY);
> + __uint(max_entries, 1);
> + __type(key, int);
> + __type(value, struct map_array);
> + __uint(map_flags, BPF_F_MMAPABLE);
> +} arraymap SEC(".maps");
> +
> +static struct map_array *get_map_array(void)
> +{
> + int zero = 0;
> +
> + return bpf_map_lookup_elem(&arraymap, &zero);
> +}
> +
> +static int *get_map_elems(void)
> +{
> + struct map_array *arr = get_map_array();
> +
> + if (!arr)
> + return NULL;
> + return arr->elems;
> +}
> +
> +/*
> + * Test that we can access all elements, and that we are accessing the element
> + * we think we are accessing.
> + */
> +static void access_all(void)
> +{
> + int *map_elems = get_map_elems();
> + int *x;
> +
> + for (int i = 0; i < NR_MAP_ELEMS; i++) {
> + x = bpf_array_elem(map_elems, NR_MAP_ELEMS, lookup_indexes[i]);
> + if (x)
> + *x = i;
> + }
> +
> + for (int i = 0; i < NR_MAP_ELEMS; i++) {
> + x = bpf_array_sz_elem(bss_elems, lookup_indexes[i]);
> + if (x)
> + *x = i;
> + }
> +}
> +
> +SEC("?tp/syscalls/sys_enter_nanosleep")
> +int x_access_all(void *ctx)
> +{
> + if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
> + return 0;
> + access_all();
> + return 0;
> +}
> +
> +/*
> + * Helper for various OOB tests. An out-of-bound access should be handled like
> + * a lookup failure. Specifically, the verifier should ensure we do not access
> + * outside the array. Userspace will check that we didn't access somewhere
> + * inside the array.
> + */
> +static void set_elem_to_1(long idx)
> +{
> + int *map_elems = get_map_elems();
> + int *x;
> +
> + x = bpf_array_elem(map_elems, NR_MAP_ELEMS, idx);
> + if (x)
> + *x = 1;
> + x = bpf_array_sz_elem(bss_elems, idx);
> + if (x)
> + *x = 1;
> +}
> +
> +/*
> + * Test various out-of-bounds accesses.
> + */
> +static void oob_access(void)
> +{
> + set_elem_to_1(NR_MAP_ELEMS + 5);
> + set_elem_to_1(NR_MAP_ELEMS);
> + set_elem_to_1(-1);
> + set_elem_to_1(~0UL);
> +}
> +
> +SEC("?tp/syscalls/sys_enter_nanosleep")
> +int x_oob_access(void *ctx)
> +{
> + if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
> + return 0;
> + oob_access();
> + return 0;
> +}
> +
> +/*
> + * Test that we can use the ARRAY_SIZE-style helper with an array in a map.
> + *
> + * Note that you cannot infer the size of the array from just a pointer; you
> + * have to use the actual elems[100]. i.e. this will fail and should fail to
> + * compile (-Wsizeof-pointer-div):
> + *
> + * int *map_elems = get_map_elems();
> + * x = bpf_array_sz_elem(map_elems, lookup_indexes[i]);
> + */
> +static void access_array_map_infer_sz(void)
> +{
> + struct map_array *arr = get_map_array();
> + int *x;
> +
> + for (int i = 0; i < NR_MAP_ELEMS; i++) {
> + x = bpf_array_sz_elem(arr->elems, lookup_indexes[i]);
> + if (x)
> + *x = i;
> + }
> +}
> +
> +SEC("?tp/syscalls/sys_enter_nanosleep")
> +int x_access_array_map_infer_sz(void *ctx)
> +{
> + if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
> + return 0;
> + access_array_map_infer_sz();
> + return 0;
> +}
> +
> +
> +
> +SEC("?tp/syscalls/sys_enter_nanosleep")
> +int x_bad_map_array_access(void *ctx)
> +{
> + int *map_elems = get_map_elems();
> +
> + /*
> + * Need to check to promote map_elems from MAP_OR_NULL to MAP so that we
> + * fail to load below for the right reason.
> + */
> + if (!map_elems)
> + return 0;
> + /* Fail to load: we don't prove our access is inside map_elems[] */
> + for (int i = 0; i < NR_MAP_ELEMS; i++)
> + map_elems[lookup_indexes[i]] = i;
> + return 0;
> +}
> +
> +SEC("?tp/syscalls/sys_enter_nanosleep")
> +int x_bad_bss_array_access(void *ctx)
> +{
> + /* Fail to load: we don't prove our access is inside bss_elems[] */
> + for (int i = 0; i < NR_MAP_ELEMS; i++)
> + bss_elems[lookup_indexes[i]] = i;
> + return 0;
> +}
> diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
> index 2fd59970c43a..002bab44cde2 100644
> --- a/tools/testing/selftests/bpf/progs/bpf_misc.h
> +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
> @@ -135,4 +135,47 @@
> /* make it look to compiler like value is read and written */
> #define __sink(expr) asm volatile("" : "+g"(expr))
>
> +/*
> + * Access an array element within a bound, such that the verifier knows the
> + * access is safe.
> + *
> + * This macro asm is the equivalent of:
> + *
> + * if (!arr)
> + * return NULL;
> + * if (idx >= arr_sz)
> + * return NULL;
> + * return &arr[idx];
> + *
> + * The index (___idx below) needs to be a u64, at least for certain versions of
> + * the BPF ISA, since there aren't u32 conditional jumps.
> + */
> +#define bpf_array_elem(arr, arr_sz, idx) ({ \
> + typeof(&(arr)[0]) ___arr = arr; \
> + __u64 ___idx = idx; \
> + if (___arr) { \
> + asm volatile("if %[__idx] >= %[__bound] goto 1f; \
> + %[__idx] *= %[__size]; \
> + %[__arr] += %[__idx]; \
> + goto 2f; \
> + 1:; \
> + %[__arr] = 0; \
> + 2: \
> + " \
> + : [__arr]"+r"(___arr), [__idx]"+r"(___idx) \
> + : [__bound]"r"((arr_sz)), \
> + [__size]"i"(sizeof(typeof((arr)[0]))) \
> + : "cc"); \
> + } \
> + ___arr; \
> +})
> +
> +/*
> + * Convenience wrapper for bpf_array_elem(), where we compute the size of the
> + * array. Be sure to use an actual array, and not a pointer, just like with the
> + * ARRAY_SIZE macro.
> + */
> +#define bpf_array_sz_elem(arr, idx) \
> + bpf_array_elem(arr, sizeof(arr) / sizeof((arr)[0]), idx)
> +
> #endif
> --
> 2.43.0.472.g3155946c3a-goog
>
>

2024-01-04 17:33:31

by Yonghong Song

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

cc Eduard.

On 1/4/24 5:43 AM, Jiri Olsa wrote:
> On Wed, Jan 03, 2024 at 01:53:59PM -0500, Barret Rhoden wrote:
>
> SNIP
>
>> +
>> +
>> +/* Test that attempting to load a bad program fails. */
>> +#define test_bad(PROG) ({ \
>> + struct array_elem_test *skel; \
>> + int err; \
>> + skel = array_elem_test__open(); \
>> + if (!ASSERT_OK_PTR(skel, "array_elem_test open")) \
>> + return; \
>> + bpf_program__set_autoload(skel->progs.x_bad_ ## PROG, true); \
>> + err = array_elem_test__load(skel); \
>> + ASSERT_ERR(err, "array_elem_test load " # PROG); \
>> + array_elem_test__destroy(skel); \
>> +})
> I wonder we could use the existing RUN_TESTS macro and use tags
> in programs like we do for example in progs/test_global_func1.c:
>
> SEC("tc")
> __failure __msg("combined stack size of 4 calls is 544")
> int global_func1(struct __sk_buff *skb)
>
> jirka
>
>
>> +
>> +void test_test_array_elem(void)
>> +{
>> + if (test__start_subtest("array_elem_access_all"))
>> + test_access_all();
>> + if (test__start_subtest("array_elem_oob_access"))
>> + test_oob_access();
>> + if (test__start_subtest("array_elem_access_array_map_infer_sz"))
>> + test_access_array_map_infer_sz();
>> + if (test__start_subtest("array_elem_bad_map_array_access"))
>> + test_bad(map_array_access);
>> + if (test__start_subtest("array_elem_bad_bss_array_access"))
>> + test_bad(bss_array_access);
>> +
[...]
>> diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
>> index 2fd59970c43a..002bab44cde2 100644
>> --- a/tools/testing/selftests/bpf/progs/bpf_misc.h
>> +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
>> @@ -135,4 +135,47 @@
>> /* make it look to compiler like value is read and written */
>> #define __sink(expr) asm volatile("" : "+g"(expr))
>>
>> +/*
>> + * Access an array element within a bound, such that the verifier knows the
>> + * access is safe.
>> + *
>> + * This macro asm is the equivalent of:
>> + *
>> + * if (!arr)
>> + * return NULL;
>> + * if (idx >= arr_sz)
>> + * return NULL;
>> + * return &arr[idx];
>> + *
>> + * The index (___idx below) needs to be a u64, at least for certain versions of
>> + * the BPF ISA, since there aren't u32 conditional jumps.
>> + */
>> +#define bpf_array_elem(arr, arr_sz, idx) ({ \
>> + typeof(&(arr)[0]) ___arr = arr; \
>> + __u64 ___idx = idx; \
>> + if (___arr) { \
>> + asm volatile("if %[__idx] >= %[__bound] goto 1f; \
>> + %[__idx] *= %[__size]; \
>> + %[__arr] += %[__idx]; \
>> + goto 2f; \
>> + 1:; \
>> + %[__arr] = 0; \
>> + 2: \
>> + " \
>> + : [__arr]"+r"(___arr), [__idx]"+r"(___idx) \
>> + : [__bound]"r"((arr_sz)), \
>> + [__size]"i"(sizeof(typeof((arr)[0]))) \
>> + : "cc"); \
>> + } \
>> + ___arr; \
>> +})

The LLVM BPF backend has made some improvements to handle cases like
 r1 = ...
 r2 = r1 + 1
 if (r2 < num) ...
 using r1
by avoiding generating the above code pattern (checking a copy of a
register but then using the original).

The implementation is pattern-matching based, so it surely won't be
able to cover all cases.

Do you have specific examples which has verification failure due to
false array out of bound access?

>> +
>> +/*
>> + * Convenience wrapper for bpf_array_elem(), where we compute the size of the
>> + * array. Be sure to use an actual array, and not a pointer, just like with the
>> + * ARRAY_SIZE macro.
>> + */
>> +#define bpf_array_sz_elem(arr, idx) \
>> + bpf_array_elem(arr, sizeof(arr) / sizeof((arr)[0]), idx)
>> +
>> #endif
>> --
>> 2.43.0.472.g3155946c3a-goog
>>
>>

2024-01-04 21:30:24

by Barret Rhoden

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

On 1/4/24 12:31, Yonghong Song wrote:
[snip]

>>> +/*
>>> + * Access an array element within a bound, such that the verifier
>>> knows the
>>> + * access is safe.
>>> + *
>>> + * This macro asm is the equivalent of:
>>> + *
>>> + *    if (!arr)
>>> + *        return NULL;
>>> + *    if (idx >= arr_sz)
>>> + *        return NULL;
>>> + *    return &arr[idx];
>>> + *
>>> + * The index (___idx below) needs to be a u64, at least for certain
>>> versions of
>>> + * the BPF ISA, since there aren't u32 conditional jumps.
>>> + */
>>> +#define bpf_array_elem(arr, arr_sz, idx) ({                \
>>> +    typeof(&(arr)[0]) ___arr = arr;                    \
>>> +    __u64 ___idx = idx;                        \
>>> +    if (___arr) {                            \
>>> +        asm volatile("if %[__idx] >= %[__bound] goto 1f;    \
>>> +                  %[__idx] *= %[__size];        \
>>> +                  %[__arr] += %[__idx];        \
>>> +                  goto 2f;                \
>>> +                  1:;                \
>>> +                  %[__arr] = 0;            \
>>> +                  2:                \
>>> +                  "                        \
>>> +                 : [__arr]"+r"(___arr), [__idx]"+r"(___idx)    \
>>> +                 : [__bound]"r"((arr_sz)),                \
>>> +                   [__size]"i"(sizeof(typeof((arr)[0])))    \
>>> +                 : "cc");                    \
>>> +    }                                \
>>> +    ___arr;                                \
>>> +})
>
> The LLVM bpf backend has made some improvement to handle the case like
>   r1 = ...
>   r2 = r1 + 1
>   if (r2 < num) ...
>   using r1
> by preventing generating the above code pattern.
>
> The implementation is a pattern matching style so surely it won't be
> able to cover all cases.
>
> Do you have specific examples which has verification failure due to
> false array out of bound access?

Not in a small example. =(

This bug has an example, but it was part of a larger program:
https://github.com/google/ghost-userspace/issues/31

The rough progression was:
- Sometimes the compiler optimizes out the checks, so we added a macro
to make the compiler not know the value of the variable anymore.
- Then the compiler would occasionally do the check on a copy of the
register, so we did the comparison and index operation all in assembly.


I tried using bpf_cmp_likely() in my actual program (not just a one-off
test), and still had a verifier issue. It's a large and convoluted
program, so it might be hard to get a small reproducer. But it is a
different compiler issue than the one you mentioned.

Specifically, I swapped out my array-access-macro for this one, using
bpf_cmp_likely():

#define bpf_array_elem(arr, arr_sz, idx) ({ \
typeof(&(arr)[0]) ___arr = arr; \
typeof(&(arr)[0]) ___ret = 0; \
u64 ___idx = idx; \
if (___arr && bpf_cmp_likely(___idx, <, arr_sz)) \
___ret = &___arr[___idx];\
___ret; \
})

which should be the same logic as before:

* if (!arr)
* return NULL;
* if (idx >= arr_sz)
* return NULL;
* return &arr[idx];

The issue I run into is different than the one you had. The compiler
did the bounds check, but then for some reason recreated the index. The
index is coming from another map.

Arguably, the verifier is doing its job - that value could have changed.
I just don't want the compiler to do the reread or any other
shenanigans in between the bounds check and the usage.

The guts of the error:
- r0 is the map (L127)
- r1 is the index, read from another map (L128)
- r1 gets verified to be less than 0x200 (L129)
- some other stuff happens
- r1 gets read again, and is no longer bound (L132)
- r1 gets scaled up by 896.
(896*0x200 = 0x70000, would be the real bound, but r1 lost the 0x200
bound)
- r0 indexed by the bad r1 (L134)
- blow up (L143)

127: (15) if r0 == 0x0 goto pc+1218 ;
R0=map_value(off=0,ks=4,vs=458752,imm=0)

128: (79) r1 = *(u64 *)(r10 -40) ;
R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0

129: (35) if r1 >= 0x200 goto pc+1216 ;
R1_w=Pscalar(umax=511,var_off=(0x0; 0x1ff))

130: (79) r4 = *(u64 *)(r10 -56) ; R4_w=Pscalar() R10=fp0;

131: (37) r4 /= 1000 ; R4_w=Pscalar()

132: (79) r1 = *(u64 *)(r10 -40) ;
R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0;

133: (27) r1 *= 896 ;
R1_w=Pscalar(umax=3848290696320,var_off=(0x0;
0x3ffffffff80),s32_max=2147483520,u32_max=-128)

134: (0f) r0 += r1 ;
R0_w=map_value(off=0,ks=4,vs=458752,umax=3848290696320,var_off=(0x0;
0x3ffffffff80),s32_max=2147483520,u32_max=-128)
R1_w=Pscalar(umax=3848290696320,var_off=(0x0;
0x3ffffffff80),s32_max=2147483520,u32_max=-128)

135: (79) r3 = *(u64 *)(r10 -48) ;
R3_w=map_value(off=0,ks=4,vs=15728640,imm=0) R10=fp0;

136: (0f) r3 += r8 ;
R3_w=map_value(off=0,ks=4,vs=15728640,umax=15728400,var_off=(0x0;
0xfffff0),s32_max=16777200,u32_max=16777200)
R8=Pscalar(umax=15728400,var_off=(0x0; 0xfffff0))

137: (61) r1 = *(u32 *)(r7 +16) ;
R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff))
R7=map_value(id=18779,off=0,ks=4,vs=224,imm=0)

138: (79) r2 = *(u64 *)(r3 +88) ; R2=Pscalar()
R3=map_value(off=0,ks=4,vs=15728640,umax=15728400,var_off=(0x0;
0xfffff0),s32_max=16777200,u32_max=16777200)

139: (a5) if r1 < 0x9 goto pc+1 ;
R1=Pscalar(umin=9,umax=4294967295,var_off=(0x0; 0xffffffff))

140: (b7) r1 = 0 ; R1_w=P0

141: (27) r1 *= 72 ; R1_w=P0

142: (0f) r0 += r1 ;
R0_w=map_value(off=0,ks=4,vs=458752,umax=3848290696320,var_off=(0x0;
0x3ffffffff80),s32_max=2147483520,u32_max=-128) R1_w=P0

143: (7b) *(u64 *)(r0 +152) = r2


if i put in a little ASM magic to tell the compiler to not recreate the
index, it works, like so:

#define BPF_MUST_CHECK(x) ({ asm volatile ("" : "+r"(x)); x; })

#define bpf_array_elem(arr, arr_sz, idx) ({ \
typeof(&(arr)[0]) ___arr = arr; \
typeof(&(arr)[0]) ___ret = 0; \
u64 ___idx = idx; \
BPF_MUST_CHECK(___idx); \
if (___arr && bpf_cmp_likely(___idx, <, arr_sz)) \
___ret = &___arr[___idx];\
___ret; \
})

Though anecdotally, that only stops the "reread the index from its map"
problem, similar to a READ_ONCE(). The compiler is still free to just use
another register for the check.

The bit of asm I had from a while back that did that was:

* r2 = r8
* r2 <<= 32
* r2 >>= 32
* if r2 > 0x3ff goto pc+29
* r8 <<= 32
* r8 >>= 32
* r8 <<= 6
* r0 += r8
* *(u64 *)(r0 +48) = r3


where r2 was bounds checked, but r8 was used instead.

I'll play around and see if I can come up with a selftest that can run
into any of these "you did the check, but threw the check away" scenarios.

Thanks,

Barret



2024-01-10 00:26:45

by Barret Rhoden

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

On 1/4/24 08:43, Jiri Olsa wrote:
> I wonder we could use the existing RUN_TESTS macro and use tags
> in programs like we do for example in progs/test_global_func1.c:
>
> SEC("tc")
> __failure __msg("combined stack size of 4 calls is 544")
> int global_func1(struct __sk_buff *skb)

This worked, thanks.

The style of test I have right now is that each test is a separate
program, with all programs in the same skeleton. RUN_TESTS attempted to
load the __failure programs, with the side-effect of loading all of the
non-failures too.

Thanks,

Barret



2024-01-10 00:46:06

by Alexei Starovoitov

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

On Thu, Jan 4, 2024 at 1:30 PM Barret Rhoden <[email protected]> wrote:
>
> On 1/4/24 12:31, Yonghong Song wrote:
> [snip]
>
> >>> +/*
> >>> + * Access an array element within a bound, such that the verifier
> >>> knows the
> >>> + * access is safe.
> >>> + *
> >>> + * This macro asm is the equivalent of:
> >>> + *
> >>> + * if (!arr)
> >>> + * return NULL;
> >>> + * if (idx >= arr_sz)
> >>> + * return NULL;
> >>> + * return &arr[idx];
> >>> + *
> >>> + * The index (___idx below) needs to be a u64, at least for certain
> >>> versions of
> >>> + * the BPF ISA, since there aren't u32 conditional jumps.
> >>> + */
> >>> +#define bpf_array_elem(arr, arr_sz, idx) ({ \
> >>> + typeof(&(arr)[0]) ___arr = arr; \
> >>> + __u64 ___idx = idx; \
> >>> + if (___arr) { \
> >>> + asm volatile("if %[__idx] >= %[__bound] goto 1f; \
> >>> + %[__idx] *= %[__size]; \
> >>> + %[__arr] += %[__idx]; \
> >>> + goto 2f; \
> >>> + 1:; \
> >>> + %[__arr] = 0; \
> >>> + 2: \
> >>> + " \
> >>> + : [__arr]"+r"(___arr), [__idx]"+r"(___idx) \
> >>> + : [__bound]"r"((arr_sz)), \
> >>> + [__size]"i"(sizeof(typeof((arr)[0]))) \
> >>> + : "cc"); \
> >>> + } \
> >>> + ___arr; \
> >>> +})
> >
> > The LLVM bpf backend has made some improvement to handle the case like
> > r1 = ...
> > r2 = r1 + 1
> > if (r2 < num) ...
> > using r1
> > by preventing generating the above code pattern.
> >
> > The implementation is a pattern matching style so surely it won't be
> > able to cover all cases.
> >
> > Do you have specific examples which has verification failure due to
> > false array out of bound access?
>
> Not in a small example. =(
>
> This bug has an example, but it was part of a larger program:
> https://github.com/google/ghost-userspace/issues/31
>
> The rough progression was:
> - sometimes the compiler optimizes out the checks. So we added a macro
> to make the compiler not know the value of the variable anymore.
> - then, the compiler would occasionally do the check on a copy of the
> register, so we did the comparison and index operation all in assembly.
>
>
> I tried using bpf_cmp_likely() in my actual program (not just a one-off
> test), and still had a verifier issue. It's a large and convoluted
> program, so it might be hard to get a small reproducer. But it a
> different compiler issue than the one you mentioned.
>
> Specifically, I swapped out my array-access-macro for this one, using
> bpf_cmp_likely():
>
> #define bpf_array_elem(arr, arr_sz, idx) ({ \
> typeof(&(arr)[0]) ___arr = arr; \
> typeof(&(arr)[0]) ___ret = 0; \
> u64 ___idx = idx; \
> if (___arr && bpf_cmp_likely(___idx, <, arr_sz)) \
> ___ret = &___arr[___idx];\
> ___ret; \
> })
>
> which should be the same logic as before:
>
> * if (!arr)
> * return NULL;
> * if (idx >= arr_sz)
> * return NULL;
> * return &arr[idx];
>
> The issue I run into is different than the one you had. The compiler
> did the bounds check, but then for some reason recreated the index. The
> index is coming from another map.
>
> Arguably, the verifier is doing its job - that value could have changed.
> I just don't want the compiler to do the reread or any other
> shenanigans in between the bounds check and the usage.
>
> The guts of the error:
> - r0 is the map (L127)
> - r1 is the index, read from another map (L128)
> - r1 gets verified to be less than 0x200 (L129)
> - some other stuff happens
> - r1 gets read again, and is no longer bound (L132)
> - r1 gets scaled up by 896.
> (896*0x200 = 0x70000, would be the real bound, but r1 lost the 0x200
> bound)
> - r0 indexed by the bad r1 (L134)
> - blow up (L143)
>
> 127: (15) if r0 == 0x0 goto pc+1218 ;
> R0=map_value(off=0,ks=4,vs=458752,imm=0)
>
> 128: (79) r1 = *(u64 *)(r10 -40) ;
> R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
>
> 129: (35) if r1 >= 0x200 goto pc+1216 ;
> R1_w=Pscalar(umax=511,var_off=(0x0; 0x1ff))
>
> 130: (79) r4 = *(u64 *)(r10 -56) ; R4_w=Pscalar() R10=fp0;
>
> 131: (37) r4 /= 1000 ; R4_w=Pscalar()
>
> 132: (79) r1 = *(u64 *)(r10 -40) ;
> R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0;
>
> 133: (27) r1 *= 896 ;
> R1_w=Pscalar(umax=3848290696320,var_off=(0x0;
> 0x3ffffffff80),s32_max=2147483520,u32_max=-128)
>
> 134: (0f) r0 += r1 ;
> R0_w=map_value(off=0,ks=4,vs=458752,umax=3848290696320,var_off=(0x0;
> 0x3ffffffff80),s32_max=2147483520,u32_max=-128)
> R1_w=Pscalar(umax=3848290696320,var_off=(0x0;
> 0x3ffffffff80),s32_max=2147483520,u32_max=-128)
>
> 135: (79) r3 = *(u64 *)(r10 -48) ;
> R3_w=map_value(off=0,ks=4,vs=15728640,imm=0) R10=fp0;
>
> 136: (0f) r3 += r8 ;
> R3_w=map_value(off=0,ks=4,vs=15728640,umax=15728400,var_off=(0x0;
> 0xfffff0),s32_max=16777200,u32_max=16777200)
> R8=Pscalar(umax=15728400,var_off=(0x0; 0xfffff0))
>
> 137: (61) r1 = *(u32 *)(r7 +16) ;
> R1_w=Pscalar(umax=4294967295,var_off=(0x0; 0xffffffff))
> R7=map_value(id=18779,off=0,ks=4,vs=224,imm=0)
>
> 138: (79) r2 = *(u64 *)(r3 +88) ; R2=Pscalar()
> R3=map_value(off=0,ks=4,vs=15728640,umax=15728400,var_off=(0x0;
> 0xfffff0),s32_max=16777200,u32_max=16777200)
>
> 139: (a5) if r1 < 0x9 goto pc+1 ;
> R1=Pscalar(umin=9,umax=4294967295,var_off=(0x0; 0xffffffff))
>
> 140: (b7) r1 = 0 ; R1_w=P0
>
> 141: (27) r1 *= 72 ; R1_w=P0
>
> 142: (0f) r0 += r1 ;
> R0_w=map_value(off=0,ks=4,vs=458752,umax=3848290696320,var_off=(0x0;
> 0x3ffffffff80),s32_max=2147483520,u32_max=-128) R1_w=P0
>
> 143: (7b) *(u64 *)(r0 +152) = r2
>
>
> if i put in a little ASM magic to tell the compiler to not recreate the
> index, it works, like so:
>
> #define BPF_MUST_CHECK(x) ({ asm volatile ("" : "+r"(x)); x; })
>
> #define bpf_array_elem(arr, arr_sz, idx) ({ \
> typeof(&(arr)[0]) ___arr = arr; \
> typeof(&(arr)[0]) ___ret = 0; \
> u64 ___idx = idx; \
> BPF_MUST_CHECK(___idx); \
> if (___arr && bpf_cmp_likely(___idx, <, arr_sz)) \
> ___ret = &___arr[___idx];\
> ___ret; \
> })
>
> though anecdotally, that only stops the "reread the index from its map"
> problem, similar to a READ_ONCE. the compiler is still free to just use
> another register for the check.
>
> The bit of ASM i had from a while back that did that was:
>
> * r2 = r8
> * r2 <<= 32
>
> * r2 >>= 32
> * if r2 > 0x3ff goto pc+29
>
> * r8 <<= 32
>
> * r8 >>= 32
>
> * r8 <<= 6
>
> * r0 += r8
> * *(u64 *)(r0 +48) = r3
>
>
> where r2 was bounds checked, but r8 was used instead.
>
> I'll play around and see if I can come up with a selftest that can run
> into any of these "you did the check, but threw the check away" scenarios.

Before we add full asm bpf_array_elem() macros, let's fully
understand the issue first. Maybe it's an LLVM deficiency
or a verifier miss that can be addressed.
Asm everywhere isn't a viable approach long term.

First start with:
asm volatile ("" : "+r"((short)x));

It will avoid unnecessary <<=32, >>=32 in -mcpu=v3,v4.

Then do:
if (likely(___arr) && bpf_cmp_likely(___idx, <, arr_sz))
^^^
just to have the expected basic block layout,
because that's what your asm does.

And, of course, a selftest is necessary to debug this further.

2024-01-10 01:02:30

by Barret Rhoden

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

On 1/4/24 16:30, Barret Rhoden wrote:
[snip]
>>
>> The LLVM bpf backend has made some improvement to handle the case like
>>    r1 = ...
>>    r2 = r1 + 1
>>    if (r2 < num) ...
>>    using r1
>> by preventing generating the above code pattern.
>>
>> The implementation is a pattern matching style so surely it won't be
>> able to cover all cases.
>>
>> Do you have specific examples which has verification failure due to
>> false array out of bound access?
>
[ snip ]

>
> I'll play around and see if I can come up with a selftest that can run
> into any of these "you did the check, but threw the check away" scenarios.

I got an example for this, and will include it in my next patch version,
which I'll CC you on.

If we can get the compiler to spill the register r1 to the stack (L11 in
the asm below), it might spill it before doing the bounds check. Then
it checks the register (L12), but the verifier doesn't know that applies
to the stack variable too. Later, we refill r1 from the stack (L21).

The reason for the spill was that I made another bpf_map_lookup_elem()
call (L19), which needed r1 as an argument.

11: (63) *(u32 *)(r10 -8) = r1 ;
R1=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) R10=fp0
fp-8=????scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))

12: (35) if r1 >= 0x64 goto pc+13 ;
R1=scalar(smin=smin32=0,smax=umax=smax32=umax32=99,var_off=(0x0; 0x7f))

13: (b4) w1 = 0 ; R1_w=0

14: (63) *(u32 *)(r10 -4) = r1 ; R1_w=0 R10=fp0 fp-8=0000mmmm

15: (bf) r2 = r10 ; R2_w=fp0 R10=fp0

16: (07) r2 += -4 ; R2_w=fp-4

17: (18) r1 = 0xffffc9000011edf0 ;
R1_w=map_ptr(map=arraymap,ks=4,vs=400)

19: (85) call bpf_map_lookup_elem#1 ;
R0_w=map_value_or_null(id=2,map=arraymap,ks=4,vs=400)

20: (15) if r0 == 0x0 goto pc+5 ;
R0_w=map_value(map=arraymap,ks=4,vs=400)

21: (61) r1 = *(u32 *)(r10 -8) ;
R1_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
R10=fp0 fp-8=mmmmmmmm
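
In C, the shape that triggers this is roughly the following (a
hypothetical reduction, not the exact program):

	u32 idx = lookup_indexes[i];	/* compiler spills idx to fp-8 */

	if (idx >= NR_MAP_ELEMS)	/* bounds check on the register */
		return 0;
	/* this call clobbers r1, forcing a later refill from the stack */
	map_elems = bpf_map_lookup_elem(&arraymap, &zero);
	if (!map_elems)
		return 0;
	map_elems[idx] = 1;		/* refilled idx has lost its bound */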


Thanks,
Barret



2024-01-10 01:07:24

by Alexei Starovoitov

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

On Tue, Jan 9, 2024 at 5:02 PM Barret Rhoden <[email protected]> wrote:
>
> On 1/4/24 16:30, Barret Rhoden wrote:
> [snip]
> >>
> >> The LLVM bpf backend has made some improvement to handle the case like
> >> r1 = ...
> >> r2 = r1 + 1
> >> if (r2 < num) ...
> >> using r1
> >> by preventing generating the above code pattern.
> >>
> >> The implementation is a pattern matching style so surely it won't be
> >> able to cover all cases.
> >>
> >> Do you have specific examples which has verification failure due to
> >> false array out of bound access?
> >
> [ snip ]
>
> >
> > I'll play around and see if I can come up with a selftest that can run
> > into any of these "you did the check, but threw the check away" scenarios.
>
> I got an example for this, and will include it in my next patch version,
> which I'll CC you on.
>
> If we can get the compiler to spill the register r1 to the stack (L11 in
> the asm below), it might spill it before doing the bounds check. Then
> it checks the register (L12), but the verifier doesn't know that applies
> to the stack variable too. Later, we refill r1 from the stack (L21).

This is a known issue.
It's addressed as part of Maxim's series:
https://patchwork.kernel.org/user/todo/netdevbpf/?series=815208

2024-01-10 01:20:58

by Barret Rhoden

Subject: Re: [PATCH v2 bpf-next 2/2] selftests/bpf: add inline assembly helpers to access array elements

On 1/9/24 20:06, Alexei Starovoitov wrote:
> This is a known issue.
> It's addressed as part of Maxim's series:
> https://patchwork.kernel.org/user/todo/netdevbpf/?series=815208

Great. Feel free to drop my patch. But hopefully people find it useful
in their own programs until Maxim's series is widely available.

I'll keep using them for the time being, since the "spill before
verifying" was just one of many things I've seen. That, and it takes me
a while to get new kernel features into production. =)

Thanks,

Barret