2021-11-03 09:52:50

by Alan Maguire

[permalink] [raw]
Subject: [PATCH bpf-next 0/2] arm64/bpf: remove 128MB limit for BPF JIT programs

There is a 128MB limit on BPF JIT program allocations; this is
to ensure BPF programs are in branching range of each other.
Patch 1 in this series removes this restriction, but because
BPF exception handling used the contiguous nature of the JIT
region in fixing up exceptions, changes are needed in exception
handling. To verify exception handling still works, a test
case to validate exception handling in BPF programs is added in
patch 2.

There was previous discussion around this topic [1], in particular
would be good to get feedback from Daniel if this approach makes
sense.

[1] https://lore.kernel.org/all/[email protected]/

Alan Maguire (1):
selftests/bpf: add exception handling selftests for tp_bpf program

Russell King (1):
arm64/bpf: remove 128MB limit for BPF JIT programs

arch/arm64/include/asm/extable.h | 9 -----
arch/arm64/include/asm/memory.h | 5 +--
arch/arm64/kernel/traps.c | 2 +-
arch/arm64/mm/extable.c | 13 +++++--
arch/arm64/mm/ptdump.c | 2 -
arch/arm64/net/bpf_jit_comp.c | 10 +++--
tools/testing/selftests/bpf/prog_tests/exhandler.c | 45 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/exhandler_kern.c | 35 +++++++++++++++++
8 files changed, 97 insertions(+), 24 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/exhandler.c
create mode 100644 tools/testing/selftests/bpf/progs/exhandler_kern.c

--
1.8.3.1


2021-11-03 09:52:57

by Alan Maguire

[permalink] [raw]
Subject: [PATCH bpf-next 1/2] arm64/bpf: remove 128MB limit for BPF JIT programs

From: Russell King <[email protected]>

commit 91fc957c9b1d ("arm64/bpf: don't allocate BPF JIT programs in module memory")

...restricts BPF JIT program allocation to a 128MB region to ensure
BPF programs are still in branching range of each other. However
this restriction should not apply to the aarch64 JIT, since
BPF_JMP | BPF_CALL are implemented as a 64-bit move into a register
and then a BLR instruction - which has the effect of being able to call
anything without proximity limitation. Removing the contiguous JIT
region requires explicitly searching the bpf exception tables first
in fixup_exception(), since they are formatted differently from
the rest of the exception tables. Previously we used the fact that
the JIT memory was contiguous to identify the fact that the context
for the exception (the program counter) is a BPF program.

The practical reason to relax this restriction on JIT memory is that 128MB
of JIT memory can be quickly exhausted, especially where PAGE_SIZE is 64KB -
one page is needed per program. In cases where seccomp filters are applied
to multiple VMs on VM launch - such filters are classic BPF but converted to
BPF - this can severely limit the number of VMs that can be launched. In a
world where we support BPF JIT always on, turning off the JIT isn't always
an option either.

Fixes: 91fc957c9b1d ("arm64/bpf: don't allocate BPF JIT programs in module memory")

Suggested-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Russell King <[email protected]>
Tested-by: Alan Maguire <[email protected]>
---
arch/arm64/include/asm/extable.h | 9 ---------
arch/arm64/include/asm/memory.h | 5 +----
arch/arm64/kernel/traps.c | 2 +-
arch/arm64/mm/extable.c | 13 +++++++++----
arch/arm64/mm/ptdump.c | 2 --
arch/arm64/net/bpf_jit_comp.c | 10 ++++++----
6 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/extable.h b/arch/arm64/include/asm/extable.h
index b15eb4a..840a35e 100644
--- a/arch/arm64/include/asm/extable.h
+++ b/arch/arm64/include/asm/extable.h
@@ -22,15 +22,6 @@ struct exception_table_entry

#define ARCH_HAS_RELATIVE_EXTABLE

-static inline bool in_bpf_jit(struct pt_regs *regs)
-{
- if (!IS_ENABLED(CONFIG_BPF_JIT))
- return false;
-
- return regs->pc >= BPF_JIT_REGION_START &&
- regs->pc < BPF_JIT_REGION_END;
-}
-
#ifdef CONFIG_BPF_JIT
int arm64_bpf_fixup_exception(const struct exception_table_entry *ex,
struct pt_regs *regs);
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index f1745a8..0588632 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -44,11 +44,8 @@
#define _PAGE_OFFSET(va) (-(UL(1) << (va)))
#define PAGE_OFFSET (_PAGE_OFFSET(VA_BITS))
#define KIMAGE_VADDR (MODULES_END)
-#define BPF_JIT_REGION_START (_PAGE_END(VA_BITS_MIN))
-#define BPF_JIT_REGION_SIZE (SZ_128M)
-#define BPF_JIT_REGION_END (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
#define MODULES_END (MODULES_VADDR + MODULES_VSIZE)
-#define MODULES_VADDR (BPF_JIT_REGION_END)
+#define MODULES_VADDR (_PAGE_END(VA_BITS_MIN))
#define MODULES_VSIZE (SZ_128M)
#define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT)))
#define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index b03e383..fe0cd05 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -988,7 +988,7 @@ static int bug_handler(struct pt_regs *regs, unsigned int esr)
static int reserved_fault_handler(struct pt_regs *regs, unsigned int esr)
{
pr_err("%s generated an invalid instruction at %pS!\n",
- in_bpf_jit(regs) ? "BPF JIT" : "Kernel text patching",
+ "Kernel text patching",
(void *)instruction_pointer(regs));

/* We cannot handle this */
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
index aa00601..60a8b6a 100644
--- a/arch/arm64/mm/extable.c
+++ b/arch/arm64/mm/extable.c
@@ -9,14 +9,19 @@
int fixup_exception(struct pt_regs *regs)
{
const struct exception_table_entry *fixup;
+ unsigned long addr;

- fixup = search_exception_tables(instruction_pointer(regs));
- if (!fixup)
- return 0;
+ addr = instruction_pointer(regs);

- if (in_bpf_jit(regs))
+ /* Search the BPF tables first, these are formatted differently */
+ fixup = search_bpf_extables(addr);
+ if (fixup)
return arm64_bpf_fixup_exception(fixup, regs);

+ fixup = search_exception_tables(addr);
+ if (!fixup)
+ return 0;
+
regs->pc = (unsigned long)&fixup->fixup + fixup->fixup;
return 1;
}
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 1c40353..9bc4066 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -41,8 +41,6 @@ enum address_markers_idx {
{ 0 /* KASAN_SHADOW_START */, "Kasan shadow start" },
{ KASAN_SHADOW_END, "Kasan shadow end" },
#endif
- { BPF_JIT_REGION_START, "BPF start" },
- { BPF_JIT_REGION_END, "BPF end" },
{ MODULES_VADDR, "Modules start" },
{ MODULES_END, "Modules end" },
{ VMALLOC_START, "vmalloc() area" },
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 41c23f4..465c44d 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1136,12 +1136,14 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
return prog;
}

+u64 bpf_jit_alloc_exec_limit(void)
+{
+ return VMALLOC_END - VMALLOC_START;
+}
+
void *bpf_jit_alloc_exec(unsigned long size)
{
- return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START,
- BPF_JIT_REGION_END, GFP_KERNEL,
- PAGE_KERNEL, 0, NUMA_NO_NODE,
- __builtin_return_address(0));
+ return vmalloc(size);
}

void bpf_jit_free_exec(void *addr)
--
1.8.3.1

2021-11-03 09:53:51

by Alan Maguire

[permalink] [raw]
Subject: [PATCH bpf-next 2/2] selftests/bpf: add exception handling selftests for tp_bpf program

Exception handling is triggered in BPF tracing programs when
a NULL pointer is dereferenced; the exception handler zeroes the
target register and execution of the BPF program progresses.

To test exception handling then, we need to trigger a NULL pointer
dereference for a field which should never be zero; if it is, the
only explanation is the exception handler ran. The skb->sk is
the NULL pointer chosen (for a ping received for 127.0.0.1 there
is no associated socket), and the sk_sndbuf size is chosen as the
"should never be 0" field. Test verifies sk is NULL and sk_sndbuf
is zero.

Signed-off-by: Alan Maguire <[email protected]>
---
tools/testing/selftests/bpf/prog_tests/exhandler.c | 45 ++++++++++++++++++++++
tools/testing/selftests/bpf/progs/exhandler_kern.c | 35 +++++++++++++++++
2 files changed, 80 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/exhandler.c
create mode 100644 tools/testing/selftests/bpf/progs/exhandler_kern.c

diff --git a/tools/testing/selftests/bpf/prog_tests/exhandler.c b/tools/testing/selftests/bpf/prog_tests/exhandler.c
new file mode 100644
index 0000000..5999498
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/exhandler.c
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021, Oracle and/or its affiliates. */
+
+#include <test_progs.h>
+
+/* Test that verifies exception handling is working; ping to localhost
+ * will result in a receive with a NULL skb->sk; our BPF program
+ * then dereferences the an sk field which shouldn't be 0, and if we
+ * see 0 we can conclude the exception handler ran when we attempted to
+ * dereference the NULL sk and zeroed the destination register.
+ */
+#include "exhandler_kern.skel.h"
+
+#define SYSTEM(...) \
+ (env.verbosity >= VERBOSE_VERY ? \
+ system(__VA_ARGS__) : system(__VA_ARGS__ " >/dev/null 2>&1"))
+
+void test_exhandler(void)
+{
+ struct exhandler_kern *skel;
+ struct exhandler_kern__bss *bss;
+ int err = 0, duration = 0;
+
+ skel = exhandler_kern__open_and_load();
+ if (CHECK(!skel, "skel_load", "skeleton failed: %d\n", err))
+ goto cleanup;
+
+ bss = skel->bss;
+
+ err = exhandler_kern__attach(skel);
+ if (CHECK(err, "attach", "attach failed: %d\n", err))
+ goto cleanup;
+
+ if (CHECK(SYSTEM("ping -c 1 127.0.0.1"),
+ "ping localhost",
+ "ping localhost failed\n"))
+ goto cleanup;
+
+ if (CHECK(bss->exception_triggered == 0,
+ "verify exceptions were triggered",
+ "no exceptions were triggered\n"))
+ goto cleanup;
+cleanup:
+ exhandler_kern__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/exhandler_kern.c b/tools/testing/selftests/bpf/progs/exhandler_kern.c
new file mode 100644
index 0000000..4049450
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/exhandler_kern.c
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021, Oracle and/or its affiliates. */
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+
+char _license[] SEC("license") = "GPL";
+
+unsigned int exception_triggered;
+
+/* TRACE_EVENT(netif_rx,
+ * TP_PROTO(struct sk_buff *skb),
+ */
+SEC("tp_btf/netif_rx")
+int BPF_PROG(trace_netif_rx, struct sk_buff *skb)
+{
+ struct sock *sk;
+ int sndbuf;
+
+ /* To verify we hit an exception we dereference skb->sk->sk_sndbuf;
+ * sndbuf size should never be zero, so if it is we know the exception
+ * handler triggered and zeroed the destination register.
+ */
+ __builtin_preserve_access_index(({
+ sk = skb->sk;
+ sndbuf = sk->sk_sndbuf;
+ }));
+
+ if (!sk && !sndbuf)
+ exception_triggered++;
+ return 0;
+}
--
1.8.3.1

2021-11-03 18:41:11

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 2/2] selftests/bpf: add exception handling selftests for tp_bpf program

On Wed, Nov 3, 2021 at 2:50 AM Alan Maguire <[email protected]> wrote:
>
> Exception handling is triggered in BPF tracing programs when
> a NULL pointer is dereferenced; the exception handler zeroes the
> target register and execution of the BPF program progresses.
>
> To test exception handling then, we need to trigger a NULL pointer
> dereference for a field which should never be zero; if it is, the
> only explanation is the exception handler ran. The skb->sk is
> the NULL pointer chosen (for a ping received for 127.0.0.1 there
> is no associated socket), and the sk_sndbuf size is chosen as the
> "should never be 0" field. Test verifies sk is NULL and sk_sndbuf
> is zero.
>
> Signed-off-by: Alan Maguire <[email protected]>
> ---
> tools/testing/selftests/bpf/prog_tests/exhandler.c | 45 ++++++++++++++++++++++
> tools/testing/selftests/bpf/progs/exhandler_kern.c | 35 +++++++++++++++++
> 2 files changed, 80 insertions(+)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/exhandler.c
> create mode 100644 tools/testing/selftests/bpf/progs/exhandler_kern.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/exhandler.c b/tools/testing/selftests/bpf/prog_tests/exhandler.c
> new file mode 100644
> index 0000000..5999498
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/exhandler.c
> @@ -0,0 +1,45 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2021, Oracle and/or its affiliates. */
> +
> +#include <test_progs.h>
> +
> +/* Test that verifies exception handling is working; ping to localhost
> + * will result in a receive with a NULL skb->sk; our BPF program
> + * then dereferences the an sk field which shouldn't be 0, and if we
> + * see 0 we can conclude the exception handler ran when we attempted to
> + * dereference the NULL sk and zeroed the destination register.
> + */
> +#include "exhandler_kern.skel.h"
> +
> +#define SYSTEM(...) \
> + (env.verbosity >= VERBOSE_VERY ? \
> + system(__VA_ARGS__) : system(__VA_ARGS__ " >/dev/null 2>&1"))
> +
> +void test_exhandler(void)
> +{
> + struct exhandler_kern *skel;
> + struct exhandler_kern__bss *bss;
> + int err = 0, duration = 0;
> +
> + skel = exhandler_kern__open_and_load();
> + if (CHECK(!skel, "skel_load", "skeleton failed: %d\n", err))
> + goto cleanup;
> +
> + bss = skel->bss;

nit: you don't need to have a separate variable for that,
skel->bss->exception_triggered in below check would be just as
readable

> +
> + err = exhandler_kern__attach(skel);
> + if (CHECK(err, "attach", "attach failed: %d\n", err))
> + goto cleanup;
> +
> + if (CHECK(SYSTEM("ping -c 1 127.0.0.1"),

Is there some other tracepoint or kernel function that could be used
for testing and triggered without shelling out to ping binary? This
hurts test isolation and will make it or some other ping-using
selftests spuriously fail when running in parallel test mode (i.e.,
sudo ./test_progs -j).

> + "ping localhost",
> + "ping localhost failed\n"))
> + goto cleanup;
> +
> + if (CHECK(bss->exception_triggered == 0,

please use ASSERT_EQ() instead, CHECK()s are kind of deprecated for new tests


> + "verify exceptions were triggered",
> + "no exceptions were triggered\n"))
> + goto cleanup;
> +cleanup:
> + exhandler_kern__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/exhandler_kern.c b/tools/testing/selftests/bpf/progs/exhandler_kern.c
> new file mode 100644
> index 0000000..4049450
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/exhandler_kern.c
> @@ -0,0 +1,35 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2021, Oracle and/or its affiliates. */
> +
> +#include "vmlinux.h"
> +
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <bpf/bpf_core_read.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +unsigned int exception_triggered;
> +
> +/* TRACE_EVENT(netif_rx,
> + * TP_PROTO(struct sk_buff *skb),
> + */
> +SEC("tp_btf/netif_rx")
> +int BPF_PROG(trace_netif_rx, struct sk_buff *skb)
> +{
> + struct sock *sk;
> + int sndbuf;
> +
> + /* To verify we hit an exception we dereference skb->sk->sk_sndbuf;
> + * sndbuf size should never be zero, so if it is we know the exception
> + * handler triggered and zeroed the destination register.
> + */
> + __builtin_preserve_access_index(({
> + sk = skb->sk;
> + sndbuf = sk->sk_sndbuf;
> + }));

you don't need __builtin_preserve_access_index(({ }) region, because
vmlinux.h already annotates all the types with preserve_access_index
attribute

> +
> + if (!sk && !sndbuf)
> + exception_triggered++;
> + return 0;
> +}
> --
> 1.8.3.1
>

2021-11-04 23:02:14

by Alan Maguire

[permalink] [raw]
Subject: Re: [PATCH bpf-next 2/2] selftests/bpf: add exception handling selftests for tp_bpf program



On Wed, 3 Nov 2021, Andrii Nakryiko wrote:

> On Wed, Nov 3, 2021 at 2:50 AM Alan Maguire <[email protected]> wrote:
> >
> > Exception handling is triggered in BPF tracing programs when
> > a NULL pointer is dereferenced; the exception handler zeroes the
> > target register and execution of the BPF program progresses.
> >
> > To test exception handling then, we need to trigger a NULL pointer
> > dereference for a field which should never be zero; if it is, the
> > only explanation is the exception handler ran. The skb->sk is
> > the NULL pointer chosen (for a ping received for 127.0.0.1 there
> > is no associated socket), and the sk_sndbuf size is chosen as the
> > "should never be 0" field. Test verifies sk is NULL and sk_sndbuf
> > is zero.
> >
> > Signed-off-by: Alan Maguire <[email protected]>
> > ---
> > tools/testing/selftests/bpf/prog_tests/exhandler.c | 45 ++++++++++++++++++++++
> > tools/testing/selftests/bpf/progs/exhandler_kern.c | 35 +++++++++++++++++
> > 2 files changed, 80 insertions(+)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/exhandler.c
> > create mode 100644 tools/testing/selftests/bpf/progs/exhandler_kern.c
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/exhandler.c b/tools/testing/selftests/bpf/prog_tests/exhandler.c
> > new file mode 100644
> > index 0000000..5999498
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/exhandler.c
<snip>
> > +
> > + bss = skel->bss;
>
> nit: you don't need to have a separate variable for that,
> skel->bss->exception_triggered in below check would be just as
> readable
>

sure, will do.

> > +
> > + err = exhandler_kern__attach(skel);
> > + if (CHECK(err, "attach", "attach failed: %d\n", err))
> > + goto cleanup;
> > +
> > + if (CHECK(SYSTEM("ping -c 1 127.0.0.1"),
>
> Is there some other tracepoint or kernel function that could be used
> for testing and triggered without shelling out to ping binary? This
> hurts test isolation and will make it or some other ping-using
> selftests spuriously fail when running in parallel test mode (i.e.,
> sudo ./test_progs -j).

I've got a new version of this working which uses a fork() in
combination with tp_btf/task_newtask ; the new task will have
a NULL task->task_works pointer, but if it wasn't NULL it
would have to point at a struct callback_head containing a
non-NULL callback function. So we can verify that
task->task_works and task->task_works->func are NULL to ensure
exception triggered instead. That should interfere
less with other parallel tests hopefully?

>
> > + "ping localhost",
> > + "ping localhost failed\n"))
> > + goto cleanup;
> > +
> > + if (CHECK(bss->exception_triggered == 0,
>
> please use ASSERT_EQ() instead, CHECK()s are kind of deprecated for new tests
>


sure, will do.

> > diff --git a/tools/testing/selftests/bpf/progs/exhandler_kern.c b/tools/testing/selftests/bpf/progs/exhandler_kern.c
> > new file mode 100644
> > index 0000000..4049450
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/exhandler_kern.c
> > @@ -0,0 +1,35 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2021, Oracle and/or its affiliates. */
> > +
> > +#include "vmlinux.h"
> > +
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +#include <bpf/bpf_core_read.h>
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +unsigned int exception_triggered;
> > +
> > +/* TRACE_EVENT(netif_rx,
> > + * TP_PROTO(struct sk_buff *skb),
> > + */
> > +SEC("tp_btf/netif_rx")
> > +int BPF_PROG(trace_netif_rx, struct sk_buff *skb)
> > +{
> > + struct sock *sk;
> > + int sndbuf;
> > +
> > + /* To verify we hit an exception we dereference skb->sk->sk_sndbuf;
> > + * sndbuf size should never be zero, so if it is we know the exception
> > + * handler triggered and zeroed the destination register.
> > + */
> > + __builtin_preserve_access_index(({
> > + sk = skb->sk;
> > + sndbuf = sk->sk_sndbuf;
> > + }));
>
> you don't need __builtin_preserve_access_index(({ }) region, because
> vmlinux.h already annotates all the types with preserve_access_index
> attribute
>

ah, great, I missed that somehow. Thanks!

Alan

2021-11-04 23:35:26

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 2/2] selftests/bpf: add exception handling selftests for tp_bpf program

On Thu, Nov 4, 2021 at 3:56 PM Alan Maguire <[email protected]> wrote:
>
>
>
> On Wed, 3 Nov 2021, Andrii Nakryiko wrote:
>
> > On Wed, Nov 3, 2021 at 2:50 AM Alan Maguire <[email protected]> wrote:
> > >
> > > Exception handling is triggered in BPF tracing programs when
> > > a NULL pointer is dereferenced; the exception handler zeroes the
> > > target register and execution of the BPF program progresses.
> > >
> > > To test exception handling then, we need to trigger a NULL pointer
> > > dereference for a field which should never be zero; if it is, the
> > > only explanation is the exception handler ran. The skb->sk is
> > > the NULL pointer chosen (for a ping received for 127.0.0.1 there
> > > is no associated socket), and the sk_sndbuf size is chosen as the
> > > "should never be 0" field. Test verifies sk is NULL and sk_sndbuf
> > > is zero.
> > >
> > > Signed-off-by: Alan Maguire <[email protected]>
> > > ---
> > > tools/testing/selftests/bpf/prog_tests/exhandler.c | 45 ++++++++++++++++++++++
> > > tools/testing/selftests/bpf/progs/exhandler_kern.c | 35 +++++++++++++++++
> > > 2 files changed, 80 insertions(+)
> > > create mode 100644 tools/testing/selftests/bpf/prog_tests/exhandler.c
> > > create mode 100644 tools/testing/selftests/bpf/progs/exhandler_kern.c
> > >
> > > diff --git a/tools/testing/selftests/bpf/prog_tests/exhandler.c b/tools/testing/selftests/bpf/prog_tests/exhandler.c
> > > new file mode 100644
> > > index 0000000..5999498
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/prog_tests/exhandler.c
> <snip>
> > > +
> > > + bss = skel->bss;
> >
> > nit: you don't need to have a separate variable for that,
> > skel->bss->exception_triggered in below check would be just as
> > readable
> >
>
> sure, will do.
>
> > > +
> > > + err = exhandler_kern__attach(skel);
> > > + if (CHECK(err, "attach", "attach failed: %d\n", err))
> > > + goto cleanup;
> > > +
> > > + if (CHECK(SYSTEM("ping -c 1 127.0.0.1"),
> >
> > Is there some other tracepoint or kernel function that could be used
> > for testing and triggered without shelling out to ping binary? This
> > hurts test isolation and will make it or some other ping-using
> > selftests spuriously fail when running in parallel test mode (i.e.,
> > sudo ./test_progs -j).
>
> I've got a new version of this working which uses a fork() in
> combination with tp_btf/task_newtask ; the new task will have
> a NULL task->task_works pointer, but if it wasn't NULL it
> would have to point at a struct callback_head containing a
> non-NULL callback function. So we can verify that
> task->task_works and task->task_works->func are NULL to ensure
> exception triggered instead. That should interfere
> less with other parallel tests hopefully?

Yeah, tracing a fork would be better, thanks!. Make sure you are
filtering by pid, to avoid accidentally tripping on some unrelated
fork.

>
> >
> > > + "ping localhost",
> > > + "ping localhost failed\n"))
> > > + goto cleanup;
> > > +
> > > + if (CHECK(bss->exception_triggered == 0,
> >
> > please use ASSERT_EQ() instead, CHECK()s are kind of deprecated for new tests
> >
>
>
> sure, will do.
>
> > > diff --git a/tools/testing/selftests/bpf/progs/exhandler_kern.c b/tools/testing/selftests/bpf/progs/exhandler_kern.c
> > > new file mode 100644
> > > index 0000000..4049450
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/progs/exhandler_kern.c
> > > @@ -0,0 +1,35 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +/* Copyright (c) 2021, Oracle and/or its affiliates. */
> > > +
> > > +#include "vmlinux.h"
> > > +
> > > +#include <bpf/bpf_helpers.h>
> > > +#include <bpf/bpf_tracing.h>
> > > +#include <bpf/bpf_core_read.h>
> > > +
> > > +char _license[] SEC("license") = "GPL";
> > > +
> > > +unsigned int exception_triggered;
> > > +
> > > +/* TRACE_EVENT(netif_rx,
> > > + * TP_PROTO(struct sk_buff *skb),
> > > + */
> > > +SEC("tp_btf/netif_rx")
> > > +int BPF_PROG(trace_netif_rx, struct sk_buff *skb)
> > > +{
> > > + struct sock *sk;
> > > + int sndbuf;
> > > +
> > > + /* To verify we hit an exception we dereference skb->sk->sk_sndbuf;
> > > + * sndbuf size should never be zero, so if it is we know the exception
> > > + * handler triggered and zeroed the destination register.
> > > + */
> > > + __builtin_preserve_access_index(({
> > > + sk = skb->sk;
> > > + sndbuf = sk->sk_sndbuf;
> > > + }));
> >
> > you don't need __builtin_preserve_access_index(({ }) region, because
> > vmlinux.h already annotates all the types with preserve_access_index
> > attribute
> >
>
> ah, great, I missed that somehow. Thanks!
>
> Alan