2020-03-03 14:12:36

by KP Singh

Subject: [PATCH bpf-next 0/7] Introduce BPF_MODIFY_RET tracing progs.

From: KP Singh <[email protected]>

This was brought up in the KRSI v4 discussion and found to be useful
both for security and tracing programs.

https://lore.kernel.org/bpf/[email protected]/

The modify_return programs are allowed for security hooks (with an
extra CAP_MAC_ADMIN check) and functions whitelisted for error
injection (ALLOW_ERROR_INJECTION).

The "security_" check is expected to be cleaned up with the KRSI patch
series.

Here is an example of how a fmod_ret program behaves:

int func_to_be_attached(int a, int b)
{ <--- do_fentry

do_fmod_ret:
<update ret by calling fmod_ret>
if (ret != 0)
goto do_fexit;

original_function:

<side_effects_happen_here>

} <--- do_fexit

ALLOW_ERROR_INJECTION(func_to_be_attached, ERRNO)

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
// This will skip the original function logic.
return -1;
}

KP Singh (7):
bpf: Refactor trampoline update code
bpf: JIT helpers for fmod_ret progs
bpf: Introduce BPF_MODIFY_RETURN
bpf: Attachment verification for BPF_MODIFY_RETURN
tools/libbpf: Add support for BPF_MODIFY_RETURN
bpf: Add test ops for BPF_PROG_TYPE_TRACING
bpf: Add selftests for BPF_MODIFY_RETURN

arch/x86/net/bpf_jit_comp.c | 261 +++++++++++++-----
include/linux/bpf.h | 24 +-
include/uapi/linux/bpf.h | 1 +
kernel/bpf/bpf_struct_ops.c | 13 +-
kernel/bpf/btf.c | 27 +-
kernel/bpf/syscall.c | 1 +
kernel/bpf/trampoline.c | 66 +++--
kernel/bpf/verifier.c | 32 +++
kernel/trace/bpf_trace.c | 1 +
net/bpf/test_run.c | 57 +++-
tools/include/uapi/linux/bpf.h | 1 +
tools/lib/bpf/libbpf.c | 4 +
.../selftests/bpf/prog_tests/fentry_fexit.c | 12 +-
.../selftests/bpf/prog_tests/fentry_test.c | 14 +-
.../selftests/bpf/prog_tests/fexit_test.c | 69 ++---
.../selftests/bpf/prog_tests/modify_return.c | 65 +++++
.../selftests/bpf/progs/modify_return.c | 49 ++++
17 files changed, 509 insertions(+), 188 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/modify_return.c
create mode 100644 tools/testing/selftests/bpf/progs/modify_return.c

--
2.20.1


2020-03-03 14:13:00

by KP Singh

Subject: [PATCH bpf-next 7/7] bpf: Add selftests for BPF_MODIFY_RETURN

From: KP Singh <[email protected]>

Test for two scenarios:

* When the fmod_ret program returns 0, the original function should
be called along with fentry and fexit programs.
* When the fmod_ret program returns a non-zero value, the original
function should not be called, no side effect should be observed and
fentry and fexit programs should be called.

The result of the kernel function call and whether a side-effect was
observed are returned via the retval attribute of the BPF_PROG_TEST_RUN
(bpf) syscall.

Signed-off-by: KP Singh <[email protected]>
---
net/bpf/test_run.c | 23 ++++++-
.../selftests/bpf/prog_tests/modify_return.c | 65 +++++++++++++++++++
.../selftests/bpf/progs/modify_return.c | 49 ++++++++++++++
3 files changed, 135 insertions(+), 2 deletions(-)
create mode 100644 tools/testing/selftests/bpf/prog_tests/modify_return.c
create mode 100644 tools/testing/selftests/bpf/progs/modify_return.c

diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index fb54b45285b4..642b6a46210b 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -10,6 +10,7 @@
#include <net/bpf_sk_storage.h>
#include <net/sock.h>
#include <net/tcp.h>
+#include <linux/error-injection.h>

#define CREATE_TRACE_POINTS
#include <trace/events/bpf_test_run.h>
@@ -143,6 +144,14 @@ int noinline bpf_fentry_test6(u64 a, void *b, short c, int d, void *e, u64 f)
return a + (long)b + c + d + (long)e + f;
}

+int noinline bpf_modify_return_test(int a, int *b)
+{
+ *b += 1;
+ return a + *b;
+}
+
+ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);
+
static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
u32 headroom, u32 tailroom)
{
@@ -168,7 +177,9 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
const union bpf_attr *kattr,
union bpf_attr __user *uattr)
{
- int err = -EFAULT;
+ u16 side_effect = 0, ret = 0;
+ int b = 2, err = -EFAULT;
+ u32 retval = 0;

switch (prog->expected_attach_type) {
case BPF_TRACE_FENTRY:
@@ -181,12 +192,20 @@ int bpf_prog_test_run_tracing(struct bpf_prog *prog,
bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111)
goto out;
break;
+ case BPF_MODIFY_RETURN:
+ ret = bpf_modify_return_test(1, &b);
+ if (b != 2)
+ side_effect = 1;
+ break;
default:
goto out;
}

- return 0;
+ retval = (u32)side_effect << 16 | ret;
+ if (copy_to_user(&uattr->test.retval, &retval, sizeof(retval)))
+ goto out;

+ return 0;
out:
trace_bpf_test_finish(&err);
return err;
diff --git a/tools/testing/selftests/bpf/prog_tests/modify_return.c b/tools/testing/selftests/bpf/prog_tests/modify_return.c
new file mode 100644
index 000000000000..beab9a37f35c
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/modify_return.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2020 Google LLC.
+ */
+
+#include <test_progs.h>
+#include "modify_return.skel.h"
+
+#define LOWER(x) (x & 0xffff)
+#define UPPER(x) (x >> 16)
+
+
+static void run_test(__u32 input_retval, __u16 want_side_effect, __s16 want_ret)
+{
+ struct modify_return *skel = NULL;
+ int err, prog_fd;
+ __u32 duration = 0, retval;
+ __u16 side_effect;
+ __s16 ret;
+
+ skel = modify_return__open_and_load();
+ if (CHECK(!skel, "skel_load", "modify_return skeleton failed\n"))
+ goto cleanup;
+
+ err = modify_return__attach(skel);
+ if (CHECK(err, "modify_return", "attach failed: %d\n", err))
+ goto cleanup;
+
+ skel->bss->input_retval = input_retval;
+ prog_fd = bpf_program__fd(skel->progs.fmod_ret_test);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0, NULL, 0,
+ &retval, &duration);
+
+ CHECK(err, "test_run", "err %d errno %d\n", err, errno);
+
+ side_effect = UPPER(retval);
+ ret = LOWER(retval);
+
+ CHECK(ret != want_ret, "test_run",
+ "unexpected ret: %d, expected: %d\n", ret, want_ret);
+ CHECK(side_effect != want_side_effect, "modify_return",
+ "unexpected side_effect: %d\n", side_effect);
+
+ CHECK(skel->bss->fentry_result != 1, "modify_return",
+ "fentry failed\n");
+ CHECK(skel->bss->fexit_result != 1, "modify_return",
+ "fexit failed\n");
+ CHECK(skel->bss->fmod_ret_result != 1, "modify_return",
+ "fmod_ret failed\n");
+
+cleanup:
+ modify_return__destroy(skel);
+}
+
+void test_modify_return(void)
+{
+ run_test(0 /* input_retval */,
+ 1 /* want_side_effect */,
+ 4 /* want_ret */);
+ run_test(-EINVAL /* input_retval */,
+ 0 /* want_side_effect */,
+ -EINVAL /* want_ret */);
+}
+
diff --git a/tools/testing/selftests/bpf/progs/modify_return.c b/tools/testing/selftests/bpf/progs/modify_return.c
new file mode 100644
index 000000000000..8b7466a15c6b
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/modify_return.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2020 Google LLC.
+ */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+static int sequence = 0;
+__s32 input_retval = 0;
+
+__u64 fentry_result = 0;
+SEC("fentry/bpf_modify_return_test")
+int BPF_PROG(fentry_test, int a, __u64 b)
+{
+ sequence++;
+ fentry_result = (sequence == 1);
+ return 0;
+}
+
+__u64 fmod_ret_result = 0;
+SEC("fmod_ret/bpf_modify_return_test")
+int BPF_PROG(fmod_ret_test, int a, int *b, int ret)
+{
+ sequence++;
+ /* This is the first fmod_ret program, the ret passed should be 0 */
+ fmod_ret_result = (sequence == 2 && ret == 0);
+ return input_retval;
+}
+
+__u64 fexit_result = 0;
+SEC("fexit/bpf_modify_return_test")
+int BPF_PROG(fexit_test, int a, __u64 b, int ret)
+{
+ sequence++;
+ /* If the input_retval is non-zero a successful modification should have
+ * occurred.
+ */
+ if (input_retval)
+ fexit_result = (sequence == 3 && ret == input_retval);
+ else
+ fexit_result = (sequence == 3 && ret == 4);
+
+ return 0;
+}
--
2.20.1

2020-03-03 14:13:20

by KP Singh

Subject: [PATCH bpf-next 5/7] tools/libbpf: Add support for BPF_MODIFY_RETURN

From: KP Singh <[email protected]>

Signed-off-by: KP Singh <[email protected]>
---
tools/lib/bpf/libbpf.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index f8c4042e5855..223be01dc466 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6288,6 +6288,10 @@ static const struct bpf_sec_def section_defs[] = {
.expected_attach_type = BPF_TRACE_FENTRY,
.is_attach_btf = true,
.attach_fn = attach_trace),
+ SEC_DEF("fmod_ret/", TRACING,
+ .expected_attach_type = BPF_MODIFY_RETURN,
+ .is_attach_btf = true,
+ .attach_fn = attach_trace),
SEC_DEF("fexit/", TRACING,
.expected_attach_type = BPF_TRACE_FEXIT,
.is_attach_btf = true,
--
2.20.1

2020-03-03 14:13:22

by KP Singh

Subject: [PATCH bpf-next 4/7] bpf: Attachment verification for BPF_MODIFY_RETURN

From: KP Singh <[email protected]>

Allow BPF_MODIFY_RETURN programs to be attached to:

- Functions that are whitelisted for error injection, i.e. checked by
within_error_injection_list.
- Security hooks, this is expected to be cleaned up after the KRSI
patches introduce the LSM_HOOK macro:

https://lore.kernel.org/bpf/[email protected]/

- The attachment is currently limited to functions that return an int.
This can be extended later to other types (e.g. PTR).

Signed-off-by: KP Singh <[email protected]>
---
kernel/bpf/btf.c | 28 ++++++++++++++++++++--------
kernel/bpf/verifier.c | 31 +++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 30841fb8b3c0..50080add2ab9 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3710,14 +3710,26 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
nr_args--;
}

- if ((prog->expected_attach_type == BPF_TRACE_FEXIT ||
- prog->expected_attach_type == BPF_MODIFY_RETURN) &&
- arg == nr_args) {
- if (!t)
- /* Default prog with 5 args. 6th arg is retval. */
- return true;
- /* function return type */
- t = btf_type_by_id(btf, t->type);
+ if (arg == nr_args) {
+ if (prog->expected_attach_type == BPF_TRACE_FEXIT) {
+ if (!t)
+ return true;
+ t = btf_type_by_id(btf, t->type);
+ } else if (prog->expected_attach_type == BPF_MODIFY_RETURN) {
+ /* For now the BPF_MODIFY_RETURN can only be attached to
+ * functions that return an int.
+ */
+ if (!t)
+ return false;
+
+ t = btf_type_skip_modifiers(btf, t->type, NULL);
+ if (!btf_type_is_int(t)) {
+ bpf_log(log,
+ "ret type %s not allowed for fmod_ret\n",
+ btf_kind_str[BTF_INFO_KIND(t->info)]);
+ return false;
+ }
+ }
} else if (arg >= nr_args) {
bpf_log(log, "func '%s' doesn't have %d-th argument\n",
tname, arg + 1);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2460c8e6b5be..ae32517d4ccd 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -19,6 +19,7 @@
#include <linux/sort.h>
#include <linux/perf_event.h>
#include <linux/ctype.h>
+#include <linux/error-injection.h>

#include "disasm.h"

@@ -9800,6 +9801,33 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)

return 0;
}
+#define SECURITY_PREFIX "security_"
+
+static int check_attach_modify_return(struct bpf_verifier_env *env)
+{
+ struct bpf_prog *prog = env->prog;
+ unsigned long addr = (unsigned long) prog->aux->trampoline->func.addr;
+
+ if (within_error_injection_list(addr))
+ return 0;
+
+ /* This is expected to be cleaned up in the future with the KRSI effort
+ * introducing the LSM_HOOK macro for cleaning up lsm_hooks.h.
+ */
+ if (!strncmp(SECURITY_PREFIX, prog->aux->attach_func_name,
+ sizeof(SECURITY_PREFIX) - 1)) {
+
+ if (!capable(CAP_MAC_ADMIN))
+ return -EPERM;
+
+ return 0;
+ }
+
+ verbose(env, "fmod_ret attach_btf_id %u (%s) is not modifiable\n",
+ prog->aux->attach_btf_id, prog->aux->attach_func_name);
+
+ return -EINVAL;
+}

static int check_attach_btf_id(struct bpf_verifier_env *env)
{
@@ -10000,6 +10028,9 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
}
tr->func.addr = (void *)addr;
prog->aux->trampoline = tr;
+
+ if (prog->expected_attach_type == BPF_MODIFY_RETURN)
+ ret = check_attach_modify_return(env);
out:
mutex_unlock(&tr->mutex);
if (ret)
--
2.20.1

2020-03-03 14:13:52

by KP Singh

Subject: [PATCH bpf-next 2/7] bpf: JIT helpers for fmod_ret progs

From: KP Singh <[email protected]>

* Split invoke_bpf into a per-program helper (invoke_bpf_prog) to
prepare for the special handling of fmod_ret programs introduced in a
subsequent patch.
* Move the definition of emit_cond_near_jump and emit_nops as they are
needed for fmod_ret.
* Refactor branch target alignment into its own function
align16_branch_target.

Signed-off-by: KP Singh <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 158 ++++++++++++++++++++----------------
1 file changed, 90 insertions(+), 68 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 15c7d28bc05c..475e354c2e88 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1361,35 +1361,100 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
-(stack_size - i * 8));
}

+static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
+ struct bpf_prog *p, int stack_size)
+{
+ u8 *prog = *pprog;
+ int cnt = 0;
+
+ if (emit_call(&prog, __bpf_prog_enter, prog))
+ return -EINVAL;
+ /* remember prog start time returned by __bpf_prog_enter */
+ emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
+
+ /* arg1: lea rdi, [rbp - stack_size] */
+ EMIT4(0x48, 0x8D, 0x7D, -stack_size);
+ /* arg2: progs[i]->insnsi for interpreter */
+ if (!p->jited)
+ emit_mov_imm64(&prog, BPF_REG_2,
+ (long) p->insnsi >> 32,
+ (u32) (long) p->insnsi);
+ /* call JITed bpf program or interpreter */
+ if (emit_call(&prog, p->bpf_func, prog))
+ return -EINVAL;
+
+ /* arg1: mov rdi, progs[i] */
+ emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32,
+ (u32) (long) p);
+ /* arg2: mov rsi, rbx <- start time in nsec */
+ emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
+ if (emit_call(&prog, __bpf_prog_exit, prog))
+ return -EINVAL;
+
+ *pprog = prog;
+ return 0;
+}
+
+static void emit_nops(u8 **pprog, unsigned int len)
+{
+ unsigned int i, noplen;
+ u8 *prog = *pprog;
+ int cnt = 0;
+
+ while (len > 0) {
+ noplen = len;
+
+ if (noplen > ASM_NOP_MAX)
+ noplen = ASM_NOP_MAX;
+
+ for (i = 0; i < noplen; i++)
+ EMIT1(ideal_nops[noplen][i]);
+ len -= noplen;
+ }
+
+ *pprog = prog;
+}
+
+/* From Intel 64 and IA-32 Architectures Optimization
+ * Reference Manual, 3.4.1.4 Code Alignment, Assembly/Compiler
+ * Coding Rule 11: All branch targets should be 16-byte
+ * aligned.
+ */
+static void align16_branch_target(u8 **pprog)
+{
+ u8 *target, *prog = *pprog;
+
+ target = PTR_ALIGN(prog, 16);
+ if (target != prog)
+ emit_nops(&prog, target - prog);
+ if (target != prog)
+ pr_err("calculation error\n");
+ *pprog = prog;
+}
+
+static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
+{
+ u8 *prog = *pprog;
+ int cnt = 0;
+ s64 offset;
+
+ offset = func - (ip + 2 + 4);
+ if (!is_simm32(offset)) {
+ pr_err("Target %p is out of range\n", func);
+ return -EINVAL;
+ }
+ EMIT2_off32(0x0F, jmp_cond + 0x10, offset);
+ *pprog = prog;
+ return 0;
+}
+
static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
struct bpf_tramp_progs *tp, int stack_size)
{
+ int i;
u8 *prog = *pprog;
- int cnt = 0, i;

for (i = 0; i < tp->nr_progs; i++) {
- if (emit_call(&prog, __bpf_prog_enter, prog))
- return -EINVAL;
- /* remember prog start time returned by __bpf_prog_enter */
- emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
-
- /* arg1: lea rdi, [rbp - stack_size] */
- EMIT4(0x48, 0x8D, 0x7D, -stack_size);
- /* arg2: progs[i]->insnsi for interpreter */
- if (!tp->progs[i]->jited)
- emit_mov_imm64(&prog, BPF_REG_2,
- (long) tp->progs[i]->insnsi >> 32,
- (u32) (long) tp->progs[i]->insnsi);
- /* call JITed bpf program or interpreter */
- if (emit_call(&prog, tp->progs[i]->bpf_func, prog))
- return -EINVAL;
-
- /* arg1: mov rdi, progs[i] */
- emit_mov_imm64(&prog, BPF_REG_1, (long) tp->progs[i] >> 32,
- (u32) (long) tp->progs[i]);
- /* arg2: mov rsi, rbx <- start time in nsec */
- emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
- if (emit_call(&prog, __bpf_prog_exit, prog))
+ if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size))
return -EINVAL;
}
*pprog = prog;
@@ -1531,42 +1596,6 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
return prog - (u8 *)image;
}

-static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
-{
- u8 *prog = *pprog;
- int cnt = 0;
- s64 offset;
-
- offset = func - (ip + 2 + 4);
- if (!is_simm32(offset)) {
- pr_err("Target %p is out of range\n", func);
- return -EINVAL;
- }
- EMIT2_off32(0x0F, jmp_cond + 0x10, offset);
- *pprog = prog;
- return 0;
-}
-
-static void emit_nops(u8 **pprog, unsigned int len)
-{
- unsigned int i, noplen;
- u8 *prog = *pprog;
- int cnt = 0;
-
- while (len > 0) {
- noplen = len;
-
- if (noplen > ASM_NOP_MAX)
- noplen = ASM_NOP_MAX;
-
- for (i = 0; i < noplen; i++)
- EMIT1(ideal_nops[noplen][i]);
- len -= noplen;
- }
-
- *pprog = prog;
-}
-
static int emit_fallback_jump(u8 **pprog)
{
u8 *prog = *pprog;
@@ -1589,7 +1618,7 @@ static int emit_fallback_jump(u8 **pprog)

static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs)
{
- u8 *jg_reloc, *jg_target, *prog = *pprog;
+ u8 *jg_reloc, *prog = *pprog;
int pivot, err, jg_bytes = 1, cnt = 0;
s64 jg_offset;

@@ -1639,14 +1668,7 @@ static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs)
if (err)
return err;

- /* From Intel 64 and IA-32 Architectures Optimization
- * Reference Manual, 3.4.1.4 Code Alignment, Assembly/Compiler
- * Coding Rule 11: All branch targets should be 16-byte
- * aligned.
- */
- jg_target = PTR_ALIGN(prog, 16);
- if (jg_target != prog)
- emit_nops(&prog, jg_target - prog);
+ align16_branch_target(&prog);
jg_offset = prog - jg_reloc;
emit_code(jg_reloc - jg_bytes, jg_offset, jg_bytes);

--
2.20.1

2020-03-03 14:13:53

by KP Singh

Subject: [PATCH bpf-next 3/7] bpf: Introduce BPF_MODIFY_RETURN

From: KP Singh <[email protected]>

When multiple programs are attached, each program receives the return
value from the previous program on the stack and the last program
provides the return value to the attached function.

The fmod_ret bpf programs are run after the fentry programs and before
the fexit programs. The original function is only called if all the
fmod_ret programs return 0 to avoid any unintended side-effects. The
success value, i.e. 0, is not currently configurable, but could be made
so by letting user-space specify it at load time.

For example:

int func_to_be_attached(int a, int b)
{ <--- do_fentry

do_fmod_ret:
<update ret by calling fmod_ret>
if (ret != 0)
goto do_fexit;

original_function:

<side_effects_happen_here>

} <--- do_fexit

The fmod_ret program attached to this function can be defined as:

SEC("fmod_ret/func_to_be_attached")
int BPF_PROG(func_name, int a, int b, int ret)
{
// This will skip the original function logic.
return 1;
}

The first fmod_ret program is passed 0 in its return argument.

Signed-off-by: KP Singh <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 96 ++++++++++++++++++++++++++++++++--
include/linux/bpf.h | 1 +
include/uapi/linux/bpf.h | 1 +
kernel/bpf/btf.c | 3 +-
kernel/bpf/syscall.c | 1 +
kernel/bpf/trampoline.c | 5 +-
kernel/bpf/verifier.c | 1 +
tools/include/uapi/linux/bpf.h | 1 +
8 files changed, 103 insertions(+), 6 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 475e354c2e88..f16f78aeecf9 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1362,7 +1362,7 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
}

static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
- struct bpf_prog *p, int stack_size)
+ struct bpf_prog *p, int stack_size, bool mod_ret)
{
u8 *prog = *pprog;
int cnt = 0;
@@ -1383,6 +1383,13 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
if (emit_call(&prog, p->bpf_func, prog))
return -EINVAL;

+ /* BPF_TRAMP_MODIFY_RETURN trampolines can modify the return
+ * of the previous call which is then passed on the stack to
+ * the next BPF program.
+ */
+ if (mod_ret)
+ emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
+
/* arg1: mov rdi, progs[i] */
emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32,
(u32) (long) p);
@@ -1447,6 +1454,23 @@ static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
return 0;
}

+static int emit_mod_ret_check_imm8(u8 **pprog, int value)
+{
+ u8 *prog = *pprog;
+ int cnt = 0;
+
+ if (!is_imm8(value))
+ return -EINVAL;
+
+ if (value == 0)
+ EMIT2(0x85, add_2reg(0xC0, BPF_REG_0, BPF_REG_0));
+ else
+ EMIT3(0x83, add_1reg(0xF8, BPF_REG_0), value);
+
+ *pprog = prog;
+ return 0;
+}
+
static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
struct bpf_tramp_progs *tp, int stack_size)
{
@@ -1454,13 +1478,53 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
u8 *prog = *pprog;

for (i = 0; i < tp->nr_progs; i++) {
- if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size))
+ if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size, false))
return -EINVAL;
}
*pprog = prog;
return 0;
}

+static int invoke_bpf_mod_ret(const struct btf_func_model *m, u8 **pprog,
+ struct bpf_tramp_progs *tp, int stack_size,
+ u8 **branches)
+{
+ u8 *prog = *pprog;
+ int i;
+
+ /* The first fmod_ret program will receive a garbage return value.
+ * Set this to 0 to avoid confusing the program.
+ */
+ emit_mov_imm32(&prog, false, BPF_REG_0, 0);
+ emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
+ for (i = 0; i < tp->nr_progs; i++) {
+ if (invoke_bpf_prog(m, &prog, tp->progs[i], stack_size, true))
+ return -EINVAL;
+
+ /* Generate a branch:
+ *
+ * if (ret != 0)
+ * goto do_fexit;
+ *
+ * If needed this can be extended to any integer value which can
+ * be passed by user-space when the program is loaded.
+ */
+ if (emit_mod_ret_check_imm8(&prog, 0))
+ return -EINVAL;
+
+ /* Save the location of the branch and generate 6 nops
+ * (4 bytes for the offset and 2 bytes for the jump). These nops
+ * are replaced with a conditional jump once do_fexit (i.e. the
+ * start of the fexit invocation) is finalized.
+ */
+ branches[i] = prog;
+ emit_nops(&prog, 4 + 2);
+ }
+
+ *pprog = prog;
+ return 0;
+}
+
/* Example:
* __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev);
* its 'struct btf_func_model' will be nr_args=2
@@ -1526,10 +1590,12 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
struct bpf_tramp_progs *tprogs,
void *orig_call)
{
- int cnt = 0, nr_args = m->nr_args;
+ int i, cnt = 0, nr_args = m->nr_args;
int stack_size = nr_args * 8;
struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
+ struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
+ u8 **branches = NULL;
u8 *prog;

/* x86-64 supports up to 6 arguments. 7+ can be added in the future */
@@ -1562,8 +1628,18 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
if (invoke_bpf(m, &prog, fentry, stack_size))
return -EINVAL;

+ if (fmod_ret->nr_progs) {
+ branches = kcalloc(fmod_ret->nr_progs, sizeof(u8 *),
+ GFP_KERNEL);
+ if (!branches)
+ return -ENOMEM;
+ if (invoke_bpf_mod_ret(m, &prog, fmod_ret, stack_size,
+ branches))
+ return -EINVAL;
+ }
+
if (flags & BPF_TRAMP_F_CALL_ORIG) {
- if (fentry->nr_progs)
+ if (fentry->nr_progs || fmod_ret->nr_progs)
restore_regs(m, &prog, nr_args, stack_size);

/* call original function */
@@ -1573,6 +1649,14 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
}

+ if (fmod_ret->nr_progs) {
+ align16_branch_target(&prog);
+ for (i = 0; i < fmod_ret->nr_progs; i++)
+ emit_cond_near_jump(&branches[i], prog, branches[i],
+ X86_JNE);
+ kfree(branches);
+ }
+
if (fexit->nr_progs)
if (invoke_bpf(m, &prog, fexit, stack_size))
return -EINVAL;
@@ -1580,6 +1664,10 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
if (flags & BPF_TRAMP_F_RESTORE_REGS)
restore_regs(m, &prog, nr_args, stack_size);

+ /* This needs to be done regardless. If there were fmod_ret programs,
+ * the return value is only updated on the stack and still needs to be
+ * restored to R0.
+ */
if (flags & BPF_TRAMP_F_CALL_ORIG)
/* restore original return value back into RAX */
emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, -8);
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 98ec10b23dbb..3cfdc216a2f4 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -473,6 +473,7 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start);

enum bpf_tramp_prog_type {
BPF_TRAMP_FENTRY,
+ BPF_TRAMP_MODIFY_RETURN,
BPF_TRAMP_FEXIT,
BPF_TRAMP_MAX,
BPF_TRAMP_REPLACE, /* more than MAX */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 8e98ced0963b..b049f69925f5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -210,6 +210,7 @@ enum bpf_attach_type {
BPF_TRACE_RAW_TP,
BPF_TRACE_FENTRY,
BPF_TRACE_FEXIT,
+ BPF_MODIFY_RETURN,
__MAX_BPF_ATTACH_TYPE
};

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 787140095e58..30841fb8b3c0 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -3710,7 +3710,8 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
nr_args--;
}

- if (prog->expected_attach_type == BPF_TRACE_FEXIT &&
+ if ((prog->expected_attach_type == BPF_TRACE_FEXIT ||
+ prog->expected_attach_type == BPF_MODIFY_RETURN) &&
arg == nr_args) {
if (!t)
/* Default prog with 5 args. 6th arg is retval. */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 13de65363ba2..7ce0815793dd 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2324,6 +2324,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog)

if (prog->expected_attach_type != BPF_TRACE_FENTRY &&
prog->expected_attach_type != BPF_TRACE_FEXIT &&
+ prog->expected_attach_type != BPF_MODIFY_RETURN &&
prog->type != BPF_PROG_TYPE_EXT) {
err = -EINVAL;
goto out_put_prog;
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 9daeb094f054..ebf0f2131848 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -233,7 +233,8 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
goto out;
}

- if (tprogs[BPF_TRAMP_FEXIT].nr_progs)
+ if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
+ tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs)
flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;

/* Though the second half of trampoline page is unused a task could be
@@ -270,6 +271,8 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(enum bpf_attach_type t)
switch (t) {
case BPF_TRACE_FENTRY:
return BPF_TRAMP_FENTRY;
+ case BPF_MODIFY_RETURN:
+ return BPF_TRAMP_MODIFY_RETURN;
case BPF_TRACE_FEXIT:
return BPF_TRAMP_FEXIT;
default:
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 289383edfc8c..2460c8e6b5be 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9950,6 +9950,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
if (!prog_extension)
return -EINVAL;
/* fallthrough */
+ case BPF_MODIFY_RETURN:
case BPF_TRACE_FENTRY:
case BPF_TRACE_FEXIT:
if (!btf_type_is_func(t)) {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 906e9f2752db..c44ee9bf3479 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -210,6 +210,7 @@ enum bpf_attach_type {
BPF_TRACE_RAW_TP,
BPF_TRACE_FENTRY,
BPF_TRACE_FEXIT,
+ BPF_MODIFY_RETURN,
__MAX_BPF_ATTACH_TYPE
};

--
2.20.1

2020-03-03 14:13:54

by KP Singh

Subject: [PATCH bpf-next 1/7] bpf: Refactor trampoline update code

From: KP Singh <[email protected]>

As we need to introduce a third type of attachment for trampolines, the
flattened signature of arch_prepare_bpf_trampoline gets even more
complicated.

Refactor the prog and count arguments of arch_prepare_bpf_trampoline
into bpf_tramp_progs to simplify the addition of, and accounting for,
new attachment types.

Signed-off-by: KP Singh <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 31 +++++++++---------
include/linux/bpf.h | 13 ++++++--
kernel/bpf/bpf_struct_ops.c | 13 +++++++-
kernel/bpf/trampoline.c | 63 +++++++++++++++++++++----------------
4 files changed, 75 insertions(+), 45 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 9ba08e9abc09..15c7d28bc05c 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1362,12 +1362,12 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
}

static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
- struct bpf_prog **progs, int prog_cnt, int stack_size)
+ struct bpf_tramp_progs *tp, int stack_size)
{
u8 *prog = *pprog;
int cnt = 0, i;

- for (i = 0; i < prog_cnt; i++) {
+ for (i = 0; i < tp->nr_progs; i++) {
if (emit_call(&prog, __bpf_prog_enter, prog))
return -EINVAL;
/* remember prog start time returned by __bpf_prog_enter */
@@ -1376,17 +1376,17 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
/* arg1: lea rdi, [rbp - stack_size] */
EMIT4(0x48, 0x8D, 0x7D, -stack_size);
/* arg2: progs[i]->insnsi for interpreter */
- if (!progs[i]->jited)
+ if (!tp->progs[i]->jited)
emit_mov_imm64(&prog, BPF_REG_2,
- (long) progs[i]->insnsi >> 32,
- (u32) (long) progs[i]->insnsi);
+ (long) tp->progs[i]->insnsi >> 32,
+ (u32) (long) tp->progs[i]->insnsi);
/* call JITed bpf program or interpreter */
- if (emit_call(&prog, progs[i]->bpf_func, prog))
+ if (emit_call(&prog, tp->progs[i]->bpf_func, prog))
return -EINVAL;

/* arg1: mov rdi, progs[i] */
- emit_mov_imm64(&prog, BPF_REG_1, (long) progs[i] >> 32,
- (u32) (long) progs[i]);
+ emit_mov_imm64(&prog, BPF_REG_1, (long) tp->progs[i] >> 32,
+ (u32) (long) tp->progs[i]);
/* arg2: mov rsi, rbx <- start time in nsec */
emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
if (emit_call(&prog, __bpf_prog_exit, prog))
@@ -1458,12 +1458,13 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
*/
int arch_prepare_bpf_trampoline(void *image, void *image_end,
const struct btf_func_model *m, u32 flags,
- struct bpf_prog **fentry_progs, int fentry_cnt,
- struct bpf_prog **fexit_progs, int fexit_cnt,
+ struct bpf_tramp_progs *tprogs,
void *orig_call)
{
int cnt = 0, nr_args = m->nr_args;
int stack_size = nr_args * 8;
+ struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
+ struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
u8 *prog;

/* x86-64 supports up to 6 arguments. 7+ can be added in the future */
@@ -1492,12 +1493,12 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,

save_regs(m, &prog, nr_args, stack_size);

- if (fentry_cnt)
- if (invoke_bpf(m, &prog, fentry_progs, fentry_cnt, stack_size))
+ if (fentry->nr_progs)
+ if (invoke_bpf(m, &prog, fentry, stack_size))
return -EINVAL;

if (flags & BPF_TRAMP_F_CALL_ORIG) {
- if (fentry_cnt)
+ if (fentry->nr_progs)
restore_regs(m, &prog, nr_args, stack_size);

/* call original function */
@@ -1507,8 +1508,8 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
}

- if (fexit_cnt)
- if (invoke_bpf(m, &prog, fexit_progs, fexit_cnt, stack_size))
+ if (fexit->nr_progs)
+ if (invoke_bpf(m, &prog, fexit, stack_size))
return -EINVAL;

if (flags & BPF_TRAMP_F_RESTORE_REGS)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f13c78c6f29d..98ec10b23dbb 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -433,6 +433,16 @@ struct btf_func_model {
*/
#define BPF_TRAMP_F_SKIP_FRAME BIT(2)

+/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
+ * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2
+ */
+#define BPF_MAX_TRAMP_PROGS 40
+
+struct bpf_tramp_progs {
+ struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS];
+ int nr_progs;
+};
+
/* Different use cases for BPF trampoline:
* 1. replace nop at the function entry (kprobe equivalent)
* flags = BPF_TRAMP_F_RESTORE_REGS
@@ -455,8 +465,7 @@ struct btf_func_model {
*/
int arch_prepare_bpf_trampoline(void *image, void *image_end,
const struct btf_func_model *m, u32 flags,
- struct bpf_prog **fentry_progs, int fentry_cnt,
- struct bpf_prog **fexit_progs, int fexit_cnt,
+ struct bpf_tramp_progs *tprogs,
void *orig_call);
/* these two functions are called from generated trampoline */
u64 notrace __bpf_prog_enter(void);
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index c498f0fffb40..a011a77b21fa 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -320,6 +320,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
struct bpf_struct_ops_value *uvalue, *kvalue;
const struct btf_member *member;
const struct btf_type *t = st_ops->type;
+ struct bpf_tramp_progs *tprogs = NULL;
void *udata, *kdata;
int prog_fd, err = 0;
void *image;
@@ -425,10 +426,19 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
goto reset_unlock;
}

+ tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),
+ GFP_KERNEL);
+ if (!tprogs) {
+ err = -ENOMEM;
+ goto reset_unlock;
+ }
+
+ *tprogs[BPF_TRAMP_FENTRY].progs = prog;
+ tprogs[BPF_TRAMP_FENTRY].nr_progs = 1;
err = arch_prepare_bpf_trampoline(image,
st_map->image + PAGE_SIZE,
&st_ops->func_models[i], 0,
- &prog, 1, NULL, 0, NULL);
+ tprogs, NULL);
if (err < 0)
goto reset_unlock;

@@ -469,6 +479,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
memset(uvalue, 0, map->value_size);
memset(kvalue, 0, map->value_size);
unlock:
+ kfree(tprogs);
mutex_unlock(&st_map->lock);
return err;
}
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 704fa787fec0..9daeb094f054 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -190,40 +190,50 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
return ret;
}

-/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
- * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2
- */
-#define BPF_MAX_TRAMP_PROGS 40
+static struct bpf_tramp_progs *
+bpf_trampoline_update_progs(struct bpf_trampoline *tr, int *total)
+{
+ struct bpf_tramp_progs *tprogs;
+ struct bpf_prog **progs;
+ struct bpf_prog_aux *aux;
+ int kind;
+
+ *total = 0;
+ tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),
+ GFP_KERNEL);
+ if (!tprogs)
+ return ERR_PTR(-ENOMEM);
+
+ for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
+ tprogs[kind].nr_progs = tr->progs_cnt[kind];
+ *total += tr->progs_cnt[kind];
+ progs = tprogs[kind].progs;
+
+ hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist)
+ *progs++ = aux->prog;
+ }
+ return tprogs;
+}

static int bpf_trampoline_update(struct bpf_trampoline *tr)
{
void *old_image = tr->image + ((tr->selector + 1) & 1) * BPF_IMAGE_SIZE/2;
void *new_image = tr->image + (tr->selector & 1) * BPF_IMAGE_SIZE/2;
- struct bpf_prog *progs_to_run[BPF_MAX_TRAMP_PROGS];
- int fentry_cnt = tr->progs_cnt[BPF_TRAMP_FENTRY];
- int fexit_cnt = tr->progs_cnt[BPF_TRAMP_FEXIT];
- struct bpf_prog **progs, **fentry, **fexit;
+ struct bpf_tramp_progs *tprogs;
u32 flags = BPF_TRAMP_F_RESTORE_REGS;
- struct bpf_prog_aux *aux;
- int err;
+ int err, total;
+
+ tprogs = bpf_trampoline_update_progs(tr, &total);
+ if (IS_ERR(tprogs))
+ return PTR_ERR(tprogs);

- if (fentry_cnt + fexit_cnt == 0) {
+ if (total == 0) {
err = unregister_fentry(tr, old_image);
tr->selector = 0;
goto out;
}

- /* populate fentry progs */
- fentry = progs = progs_to_run;
- hlist_for_each_entry(aux, &tr->progs_hlist[BPF_TRAMP_FENTRY], tramp_hlist)
- *progs++ = aux->prog;
-
- /* populate fexit progs */
- fexit = progs;
- hlist_for_each_entry(aux, &tr->progs_hlist[BPF_TRAMP_FEXIT], tramp_hlist)
- *progs++ = aux->prog;
-
- if (fexit_cnt)
+ if (tprogs[BPF_TRAMP_FEXIT].nr_progs)
flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;

/* Though the second half of trampoline page is unused a task could be
@@ -232,12 +242,11 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
* preempted task. Hence wait for tasks to voluntarily schedule or go
* to userspace.
*/
+
synchronize_rcu_tasks();

err = arch_prepare_bpf_trampoline(new_image, new_image + BPF_IMAGE_SIZE / 2,
- &tr->func.model, flags,
- fentry, fentry_cnt,
- fexit, fexit_cnt,
+ &tr->func.model, flags, tprogs,
tr->func.addr);
if (err < 0)
goto out;
@@ -252,6 +261,7 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
goto out;
tr->selector++;
out:
+ kfree(tprogs);
return err;
}

@@ -409,8 +419,7 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
int __weak
arch_prepare_bpf_trampoline(void *image, void *image_end,
const struct btf_func_model *m, u32 flags,
- struct bpf_prog **fentry_progs, int fentry_cnt,
- struct bpf_prog **fexit_progs, int fexit_cnt,
+ struct bpf_tramp_progs *tprogs,
void *orig_call)
{
return -ENOTSUPP;
--
2.20.1

2020-03-03 14:14:32

by KP Singh

Subject: [PATCH bpf-next 6/7] bpf: Add test ops for BPF_PROG_TYPE_TRACING

From: KP Singh <[email protected]>

The current fexit and fentry tests rely on a different program to
exercise the functions they attach to. Instead of doing this, implement
the test operations for tracing which will also be used for
BPF_MODIFY_RETURN in a subsequent patch.

Also, clean up the fexit test to use the generated skeleton.

Signed-off-by: KP Singh <[email protected]>
---
include/linux/bpf.h | 10 +++
kernel/trace/bpf_trace.c | 1 +
net/bpf/test_run.c | 38 +++++++---
.../selftests/bpf/prog_tests/fentry_fexit.c | 12 +---
.../selftests/bpf/prog_tests/fentry_test.c | 14 ++--
.../selftests/bpf/prog_tests/fexit_test.c | 69 ++++++-------------
6 files changed, 68 insertions(+), 76 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 3cfdc216a2f4..c00919025532 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1156,6 +1156,9 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
union bpf_attr __user *uattr);
int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
union bpf_attr __user *uattr);
+int bpf_prog_test_run_tracing(struct bpf_prog *prog,
+ const union bpf_attr *kattr,
+ union bpf_attr __user *uattr);
int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
const union bpf_attr *kattr,
union bpf_attr __user *uattr);
@@ -1313,6 +1316,13 @@ static inline int bpf_prog_test_run_skb(struct bpf_prog *prog,
return -ENOTSUPP;
}

+static inline int bpf_prog_test_run_tracing(struct bpf_prog *prog,
+ const union bpf_attr *kattr,
+ union bpf_attr __user *uattr)
+{
+ return -ENOTSUPP;
+}
+
static inline int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
const union bpf_attr *kattr,
union bpf_attr __user *uattr)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 07764c761073..363e0a2c75cf 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1266,6 +1266,7 @@ const struct bpf_verifier_ops tracing_verifier_ops = {
};

const struct bpf_prog_ops tracing_prog_ops = {
+ .test_run = bpf_prog_test_run_tracing,
};

static bool raw_tp_writable_prog_is_valid_access(int off, int size,
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 562443f94133..fb54b45285b4 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -160,18 +160,38 @@ static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
kfree(data);
return ERR_PTR(-EFAULT);
}
- if (bpf_fentry_test1(1) != 2 ||
- bpf_fentry_test2(2, 3) != 5 ||
- bpf_fentry_test3(4, 5, 6) != 15 ||
- bpf_fentry_test4((void *)7, 8, 9, 10) != 34 ||
- bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
- bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111) {
- kfree(data);
- return ERR_PTR(-EFAULT);
- }
+
return data;
}

+int bpf_prog_test_run_tracing(struct bpf_prog *prog,
+ const union bpf_attr *kattr,
+ union bpf_attr __user *uattr)
+{
+ int err = -EFAULT;
+
+ switch (prog->expected_attach_type) {
+ case BPF_TRACE_FENTRY:
+ case BPF_TRACE_FEXIT:
+ if (bpf_fentry_test1(1) != 2 ||
+ bpf_fentry_test2(2, 3) != 5 ||
+ bpf_fentry_test3(4, 5, 6) != 15 ||
+ bpf_fentry_test4((void *)7, 8, 9, 10) != 34 ||
+ bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
+ bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111)
+ goto out;
+ break;
+ default:
+ goto out;
+ }
+
+ return 0;
+
+out:
+ trace_bpf_test_finish(&err);
+ return err;
+}
+
static void *bpf_ctx_init(const union bpf_attr *kattr, u32 max_size)
{
void __user *data_in = u64_to_user_ptr(kattr->test.ctx_in);
diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
index 235ac4f67f5b..83493bd5745c 100644
--- a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
@@ -1,22 +1,17 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2019 Facebook */
#include <test_progs.h>
-#include "test_pkt_access.skel.h"
#include "fentry_test.skel.h"
#include "fexit_test.skel.h"

void test_fentry_fexit(void)
{
- struct test_pkt_access *pkt_skel = NULL;
struct fentry_test *fentry_skel = NULL;
struct fexit_test *fexit_skel = NULL;
__u64 *fentry_res, *fexit_res;
__u32 duration = 0, retval;
- int err, pkt_fd, i;
+ int err, prog_fd, i;

- pkt_skel = test_pkt_access__open_and_load();
- if (CHECK(!pkt_skel, "pkt_skel_load", "pkt_access skeleton failed\n"))
- return;
fentry_skel = fentry_test__open_and_load();
if (CHECK(!fentry_skel, "fentry_skel_load", "fentry skeleton failed\n"))
goto close_prog;
@@ -31,8 +26,8 @@ void test_fentry_fexit(void)
if (CHECK(err, "fexit_attach", "fexit attach failed: %d\n", err))
goto close_prog;

- pkt_fd = bpf_program__fd(pkt_skel->progs.test_pkt_access);
- err = bpf_prog_test_run(pkt_fd, 1, &pkt_v6, sizeof(pkt_v6),
+ prog_fd = bpf_program__fd(fexit_skel->progs.test1);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
NULL, NULL, &retval, &duration);
CHECK(err || retval, "ipv6",
"err %d errno %d retval %d duration %d\n",
@@ -49,7 +44,6 @@ void test_fentry_fexit(void)
}

close_prog:
- test_pkt_access__destroy(pkt_skel);
fentry_test__destroy(fentry_skel);
fexit_test__destroy(fexit_skel);
}
diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_test.c b/tools/testing/selftests/bpf/prog_tests/fentry_test.c
index 5cc06021f27d..04ebbf1cb390 100644
--- a/tools/testing/selftests/bpf/prog_tests/fentry_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_test.c
@@ -1,20 +1,15 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2019 Facebook */
#include <test_progs.h>
-#include "test_pkt_access.skel.h"
#include "fentry_test.skel.h"

void test_fentry_test(void)
{
- struct test_pkt_access *pkt_skel = NULL;
struct fentry_test *fentry_skel = NULL;
- int err, pkt_fd, i;
+ int err, prog_fd, i;
__u32 duration = 0, retval;
__u64 *result;

- pkt_skel = test_pkt_access__open_and_load();
- if (CHECK(!pkt_skel, "pkt_skel_load", "pkt_access skeleton failed\n"))
- return;
fentry_skel = fentry_test__open_and_load();
if (CHECK(!fentry_skel, "fentry_skel_load", "fentry skeleton failed\n"))
goto cleanup;
@@ -23,10 +18,10 @@ void test_fentry_test(void)
if (CHECK(err, "fentry_attach", "fentry attach failed: %d\n", err))
goto cleanup;

- pkt_fd = bpf_program__fd(pkt_skel->progs.test_pkt_access);
- err = bpf_prog_test_run(pkt_fd, 1, &pkt_v6, sizeof(pkt_v6),
+ prog_fd = bpf_program__fd(fentry_skel->progs.test1);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
NULL, NULL, &retval, &duration);
- CHECK(err || retval, "ipv6",
+ CHECK(err || retval, "test_run",
"err %d errno %d retval %d duration %d\n",
err, errno, retval, duration);

@@ -39,5 +34,4 @@ void test_fentry_test(void)

cleanup:
fentry_test__destroy(fentry_skel);
- test_pkt_access__destroy(pkt_skel);
}
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_test.c b/tools/testing/selftests/bpf/prog_tests/fexit_test.c
index d2c3655dd7a3..78d7a2765c27 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_test.c
@@ -1,64 +1,37 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2019 Facebook */
#include <test_progs.h>
+#include "fexit_test.skel.h"

void test_fexit_test(void)
{
- struct bpf_prog_load_attr attr = {
- .file = "./fexit_test.o",
- };
-
- char prog_name[] = "fexit/bpf_fentry_testX";
- struct bpf_object *obj = NULL, *pkt_obj;
- int err, pkt_fd, kfree_skb_fd, i;
- struct bpf_link *link[6] = {};
- struct bpf_program *prog[6];
+ struct fexit_test *fexit_skel = NULL;
+ int err, prog_fd, i;
__u32 duration = 0, retval;
- struct bpf_map *data_map;
- const int zero = 0;
- u64 result[6];
+ __u64 *result;

- err = bpf_prog_load("./test_pkt_access.o", BPF_PROG_TYPE_SCHED_CLS,
- &pkt_obj, &pkt_fd);
- if (CHECK(err, "prog_load sched cls", "err %d errno %d\n", err, errno))
- return;
- err = bpf_prog_load_xattr(&attr, &obj, &kfree_skb_fd);
- if (CHECK(err, "prog_load fail", "err %d errno %d\n", err, errno))
- goto close_prog;
+ fexit_skel = fexit_test__open_and_load();
+ if (CHECK(!fexit_skel, "fexit_skel_load", "fexit skeleton failed\n"))
+ goto cleanup;

- for (i = 0; i < 6; i++) {
- prog_name[sizeof(prog_name) - 2] = '1' + i;
- prog[i] = bpf_object__find_program_by_title(obj, prog_name);
- if (CHECK(!prog[i], "find_prog", "prog %s not found\n", prog_name))
- goto close_prog;
- link[i] = bpf_program__attach_trace(prog[i]);
- if (CHECK(IS_ERR(link[i]), "attach_trace", "failed to link\n"))
- goto close_prog;
- }
- data_map = bpf_object__find_map_by_name(obj, "fexit_te.bss");
- if (CHECK(!data_map, "find_data_map", "data map not found\n"))
- goto close_prog;
+ err = fexit_test__attach(fexit_skel);
+ if (CHECK(err, "fexit_attach", "fexit attach failed: %d\n", err))
+ goto cleanup;

- err = bpf_prog_test_run(pkt_fd, 1, &pkt_v6, sizeof(pkt_v6),
+ prog_fd = bpf_program__fd(fexit_skel->progs.test1);
+ err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
NULL, NULL, &retval, &duration);
- CHECK(err || retval, "ipv6",
+ CHECK(err || retval, "test_run",
"err %d errno %d retval %d duration %d\n",
err, errno, retval, duration);

- err = bpf_map_lookup_elem(bpf_map__fd(data_map), &zero, &result);
- if (CHECK(err, "get_result",
- "failed to get output data: %d\n", err))
- goto close_prog;
-
- for (i = 0; i < 6; i++)
- if (CHECK(result[i] != 1, "result", "bpf_fentry_test%d failed err %ld\n",
- i + 1, result[i]))
- goto close_prog;
+ result = (__u64 *)fexit_skel->bss;
+ for (i = 0; i < 6; i++) {
+ if (CHECK(result[i] != 1, "result",
+ "fexit_test%d failed err %lld\n", i + 1, result[i]))
+ goto cleanup;
+ }

-close_prog:
- for (i = 0; i < 6; i++)
- if (!IS_ERR_OR_NULL(link[i]))
- bpf_link__destroy(link[i]);
- bpf_object__close(obj);
- bpf_object__close(pkt_obj);
+cleanup:
+ fexit_test__destroy(fexit_skel);
}
--
2.20.1

2020-03-03 22:14:29

by Andrii Nakryiko

Subject: Re: [PATCH bpf-next 1/7] bpf: Refactor trampoline update code

On Tue, Mar 3, 2020 at 6:13 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> As we need to introduce a third type of attachment for trampolines, the
> flattened signature of arch_prepare_bpf_trampoline gets even more
> complicated.
>
> Refactor the prog and count argument to arch_prepare_bpf_trampoline to
> use bpf_tramp_progs to simplify the addition and accounting for new
> attachment types.
>
> Signed-off-by: KP Singh <[email protected]>
> ---
> arch/x86/net/bpf_jit_comp.c | 31 +++++++++---------
> include/linux/bpf.h | 13 ++++++--
> kernel/bpf/bpf_struct_ops.c | 13 +++++++-
> kernel/bpf/trampoline.c | 63 +++++++++++++++++++++----------------
> 4 files changed, 75 insertions(+), 45 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 9ba08e9abc09..15c7d28bc05c 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1362,12 +1362,12 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
> }
>
> static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> - struct bpf_prog **progs, int prog_cnt, int stack_size)
> + struct bpf_tramp_progs *tp, int stack_size)

nit: it's `tp` here, but `tprogs` in arch_prepare_bpf_trampoline. It's
minor, but would be nice to stick to consistent naming.

> {
> u8 *prog = *pprog;
> int cnt = 0, i;
>
> - for (i = 0; i < prog_cnt; i++) {
> + for (i = 0; i < tp->nr_progs; i++) {
> if (emit_call(&prog, __bpf_prog_enter, prog))
> return -EINVAL;
> /* remember prog start time returned by __bpf_prog_enter */
> @@ -1376,17 +1376,17 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> /* arg1: lea rdi, [rbp - stack_size] */
> EMIT4(0x48, 0x8D, 0x7D, -stack_size);
> /* arg2: progs[i]->insnsi for interpreter */
> - if (!progs[i]->jited)
> + if (!tp->progs[i]->jited)
> emit_mov_imm64(&prog, BPF_REG_2,
> - (long) progs[i]->insnsi >> 32,
> - (u32) (long) progs[i]->insnsi);
> + (long) tp->progs[i]->insnsi >> 32,
> + (u32) (long) tp->progs[i]->insnsi);
> /* call JITed bpf program or interpreter */
> - if (emit_call(&prog, progs[i]->bpf_func, prog))
> + if (emit_call(&prog, tp->progs[i]->bpf_func, prog))
> return -EINVAL;
>
> /* arg1: mov rdi, progs[i] */
> - emit_mov_imm64(&prog, BPF_REG_1, (long) progs[i] >> 32,
> - (u32) (long) progs[i]);
> + emit_mov_imm64(&prog, BPF_REG_1, (long) tp->progs[i] >> 32,
> + (u32) (long) tp->progs[i]);
> /* arg2: mov rsi, rbx <- start time in nsec */
> emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
> if (emit_call(&prog, __bpf_prog_exit, prog))
> @@ -1458,12 +1458,13 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> */
> int arch_prepare_bpf_trampoline(void *image, void *image_end,
> const struct btf_func_model *m, u32 flags,
> - struct bpf_prog **fentry_progs, int fentry_cnt,
> - struct bpf_prog **fexit_progs, int fexit_cnt,
> + struct bpf_tramp_progs *tprogs,
> void *orig_call)
> {
> int cnt = 0, nr_args = m->nr_args;
> int stack_size = nr_args * 8;
> + struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> + struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> u8 *prog;
>
> /* x86-64 supports up to 6 arguments. 7+ can be added in the future */

[...]

> diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> index c498f0fffb40..a011a77b21fa 100644
> --- a/kernel/bpf/bpf_struct_ops.c
> +++ b/kernel/bpf/bpf_struct_ops.c
> @@ -320,6 +320,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> struct bpf_struct_ops_value *uvalue, *kvalue;
> const struct btf_member *member;
> const struct btf_type *t = st_ops->type;
> + struct bpf_tramp_progs *tprogs = NULL;
> void *udata, *kdata;
> int prog_fd, err = 0;
> void *image;
> @@ -425,10 +426,19 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> goto reset_unlock;
> }
>
> + tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),

nit: sizeof(*tprogs) ?

> + GFP_KERNEL);
> + if (!tprogs) {
> + err = -ENOMEM;
> + goto reset_unlock;
> + }
> +
> + *tprogs[BPF_TRAMP_FENTRY].progs = prog;

I'm very confused what's going on here, why * at the beginning here,
but no * below?.. It seems unnecessary.

> + tprogs[BPF_TRAMP_FENTRY].nr_progs = 1;
> err = arch_prepare_bpf_trampoline(image,
> st_map->image + PAGE_SIZE,
> &st_ops->func_models[i], 0,
> - &prog, 1, NULL, 0, NULL);
> + tprogs, NULL);
> if (err < 0)
> goto reset_unlock;
>
> @@ -469,6 +479,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> memset(uvalue, 0, map->value_size);
> memset(kvalue, 0, map->value_size);
> unlock:
> + kfree(tprogs);
> mutex_unlock(&st_map->lock);
> return err;
> }
> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 704fa787fec0..9daeb094f054 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -190,40 +190,50 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
> return ret;
> }
>
> -/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> - * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2
> - */
> -#define BPF_MAX_TRAMP_PROGS 40
> +static struct bpf_tramp_progs *
> +bpf_trampoline_update_progs(struct bpf_trampoline *tr, int *total)
> +{
> + struct bpf_tramp_progs *tprogs;
> + struct bpf_prog **progs;
> + struct bpf_prog_aux *aux;
> + int kind;
> +
> + *total = 0;
> + tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),

same nit as above, sizeof(*tprogs) is shorter and less error-prone

> + GFP_KERNEL);
> + if (!tprogs)
> + return ERR_PTR(-ENOMEM);
> +
> + for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> + tprogs[kind].nr_progs = tr->progs_cnt[kind];
> + *total += tr->progs_cnt[kind];
> + progs = tprogs[kind].progs;
> +
> + hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist)
> + *progs++ = aux->prog;
> + }
> + return tprogs;
> +}
>

[...]

2020-03-03 22:14:33

by Andrii Nakryiko

Subject: Re: [PATCH bpf-next 0/7] Introduce BPF_MODIFY_RET tracing progs.

On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> This was brought up in the KRSI v4 discussion and found to be useful
> both for security and tracing programs.
>
> https://lore.kernel.org/bpf/[email protected]/
>
> The modify_return programs are allowed for security hooks (with an
> extra CAP_MAC_ADMIN check) and functions whitelisted for error
> injection (ALLOW_ERROR_INJECTION).
>
> The "security_" check is expected to be cleaned up with the KRSI patch
> series.
>
> Here is an example of how a fmod_ret program behaves:
>
> int func_to_be_attached(int a, int b)
> { <--- do_fentry
>
> do_fmod_ret:
> <update ret by calling fmod_ret>
> if (ret != 0)
> goto do_fexit;
>
> original_function:
>
> <side_effects_happen_here>
>
> } <--- do_fexit
>
> ALLOW_ERROR_INJECTION(func_to_be_attached, ERRNO)
>
> The fmod_ret program attached to this function can be defined as:
>
> SEC("fmod_ret/func_to_be_attached")
> BPF_PROG(func_name, int a, int b, int ret)

nit: int BPF_PROG(...)


> {
> // This will skip the original function logic.
> return -1;
> }
>
> KP Singh (7):
> bpf: Refactor trampoline update code
> bpf: JIT helpers for fmod_ret progs
> bpf: Introduce BPF_MODIFY_RETURN
> bpf: Attachment verification for BPF_MODIFY_RETURN
> tools/libbpf: Add support for BPF_MODIFY_RETURN
> bpf: Add test ops for BPF_PROG_TYPE_TRACING
> bpf: Add selftests for BPF_MODIFY_RETURN
>
> arch/x86/net/bpf_jit_comp.c | 261 +++++++++++++-----
> include/linux/bpf.h | 24 +-
> include/uapi/linux/bpf.h | 1 +
> kernel/bpf/bpf_struct_ops.c | 13 +-
> kernel/bpf/btf.c | 27 +-
> kernel/bpf/syscall.c | 1 +
> kernel/bpf/trampoline.c | 66 +++--
> kernel/bpf/verifier.c | 32 +++
> kernel/trace/bpf_trace.c | 1 +
> net/bpf/test_run.c | 57 +++-
> tools/include/uapi/linux/bpf.h | 1 +
> tools/lib/bpf/libbpf.c | 4 +
> .../selftests/bpf/prog_tests/fentry_fexit.c | 12 +-
> .../selftests/bpf/prog_tests/fentry_test.c | 14 +-
> .../selftests/bpf/prog_tests/fexit_test.c | 69 ++---
> .../selftests/bpf/prog_tests/modify_return.c | 65 +++++
> .../selftests/bpf/progs/modify_return.c | 49 ++++
> 17 files changed, 509 insertions(+), 188 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/modify_return.c
> create mode 100644 tools/testing/selftests/bpf/progs/modify_return.c
>
> --
> 2.20.1
>

2020-03-03 22:25:03

by KP Singh

Subject: Re: [PATCH bpf-next 1/7] bpf: Refactor trampoline update code

On 03-Mär 14:12, Andrii Nakryiko wrote:
> On Tue, Mar 3, 2020 at 6:13 AM KP Singh <[email protected]> wrote:
> >
> > From: KP Singh <[email protected]>
> >
> > As we need to introduce a third type of attachment for trampolines, the
> > flattened signature of arch_prepare_bpf_trampoline gets even more
> > complicated.
> >
> > Refactor the prog and count argument to arch_prepare_bpf_trampoline to
> > use bpf_tramp_progs to simplify the addition and accounting for new
> > attachment types.
> >
> > Signed-off-by: KP Singh <[email protected]>
> > ---
> > arch/x86/net/bpf_jit_comp.c | 31 +++++++++---------
> > include/linux/bpf.h | 13 ++++++--
> > kernel/bpf/bpf_struct_ops.c | 13 +++++++-
> > kernel/bpf/trampoline.c | 63 +++++++++++++++++++++----------------
> > 4 files changed, 75 insertions(+), 45 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 9ba08e9abc09..15c7d28bc05c 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1362,12 +1362,12 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
> > }
> >
> > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > - struct bpf_prog **progs, int prog_cnt, int stack_size)
> > + struct bpf_tramp_progs *tp, int stack_size)
>
> nit: it's `tp` here, but `tprogs` in arch_prepare_bpf_trampoline. It's
> minor, but would be nice to stick to consistent naming.

I did this to distinguish that, rather than being the full array of
tprogs, it's a pointer to one of its members, e.g.
&tprogs[BPF_TRAMP_FEXIT].

I can change it if you feel this is not a valuable distinction.

>
> > {
> > u8 *prog = *pprog;
> > int cnt = 0, i;
> >
> > - for (i = 0; i < prog_cnt; i++) {
> > + for (i = 0; i < tp->nr_progs; i++) {
> > if (emit_call(&prog, __bpf_prog_enter, prog))
> > return -EINVAL;
> > /* remember prog start time returned by __bpf_prog_enter */
> > @@ -1376,17 +1376,17 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > /* arg1: lea rdi, [rbp - stack_size] */
> > EMIT4(0x48, 0x8D, 0x7D, -stack_size);
> > /* arg2: progs[i]->insnsi for interpreter */
> > - if (!progs[i]->jited)
> > + if (!tp->progs[i]->jited)
> > emit_mov_imm64(&prog, BPF_REG_2,
> > - (long) progs[i]->insnsi >> 32,
> > - (u32) (long) progs[i]->insnsi);
> > + (long) tp->progs[i]->insnsi >> 32,
> > + (u32) (long) tp->progs[i]->insnsi);
> > /* call JITed bpf program or interpreter */
> > - if (emit_call(&prog, progs[i]->bpf_func, prog))
> > + if (emit_call(&prog, tp->progs[i]->bpf_func, prog))
> > return -EINVAL;
> >
> > /* arg1: mov rdi, progs[i] */
> > - emit_mov_imm64(&prog, BPF_REG_1, (long) progs[i] >> 32,
> > - (u32) (long) progs[i]);
> > + emit_mov_imm64(&prog, BPF_REG_1, (long) tp->progs[i] >> 32,
> > + (u32) (long) tp->progs[i]);
> > /* arg2: mov rsi, rbx <- start time in nsec */
> > emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
> > if (emit_call(&prog, __bpf_prog_exit, prog))
> > @@ -1458,12 +1458,13 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > */
> > int arch_prepare_bpf_trampoline(void *image, void *image_end,
> > const struct btf_func_model *m, u32 flags,
> > - struct bpf_prog **fentry_progs, int fentry_cnt,
> > - struct bpf_prog **fexit_progs, int fexit_cnt,
> > + struct bpf_tramp_progs *tprogs,
> > void *orig_call)
> > {
> > int cnt = 0, nr_args = m->nr_args;
> > int stack_size = nr_args * 8;
> > + struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> > + struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> > u8 *prog;
> >
> > /* x86-64 supports up to 6 arguments. 7+ can be added in the future */
>
> [...]
>
> > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > index c498f0fffb40..a011a77b21fa 100644
> > --- a/kernel/bpf/bpf_struct_ops.c
> > +++ b/kernel/bpf/bpf_struct_ops.c
> > @@ -320,6 +320,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > struct bpf_struct_ops_value *uvalue, *kvalue;
> > const struct btf_member *member;
> > const struct btf_type *t = st_ops->type;
> > + struct bpf_tramp_progs *tprogs = NULL;
> > void *udata, *kdata;
> > int prog_fd, err = 0;
> > void *image;
> > @@ -425,10 +426,19 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > goto reset_unlock;
> > }
> >
> > + tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),
>
> nit: sizeof(*tprogs) ?

Sure. I can fix it.

>
> > + GFP_KERNEL);
> > + if (!tprogs) {
> > + err = -ENOMEM;
> > + goto reset_unlock;
> > + }
> > +
> > + *tprogs[BPF_TRAMP_FENTRY].progs = prog;
>
> I'm very confused what's going on here, why * at the beginning here,
> but no * below?.. It seems unnecessary.

The progs member of bpf_tramp_progs is

struct bpf_tramp_progs {
struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS];
int nr_progs;
};

This is equivalent to the **progs we had before in the signature of
arch_prepare_bpf_trampoline.

*tprogs[BPF_TRAMP_FENTRY].progs = prog;

is setting the program in the **progs. The one below is setting the
count. Am I missing something :)

>
> > + tprogs[BPF_TRAMP_FENTRY].nr_progs = 1;
> > err = arch_prepare_bpf_trampoline(image,
> > st_map->image + PAGE_SIZE,
> > &st_ops->func_models[i], 0,
> > - &prog, 1, NULL, 0, NULL);
> > + tprogs, NULL);
> > if (err < 0)
> > goto reset_unlock;
> >
> > @@ -469,6 +479,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > memset(uvalue, 0, map->value_size);
> > memset(kvalue, 0, map->value_size);
> > unlock:
> > + kfree(tprogs);
> > mutex_unlock(&st_map->lock);
> > return err;
> > }
> > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > index 704fa787fec0..9daeb094f054 100644
> > --- a/kernel/bpf/trampoline.c
> > +++ b/kernel/bpf/trampoline.c
> > @@ -190,40 +190,50 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
> > return ret;
> > }
> >
> > -/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> > - * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2
> > - */
> > -#define BPF_MAX_TRAMP_PROGS 40
> > +static struct bpf_tramp_progs *
> > +bpf_trampoline_update_progs(struct bpf_trampoline *tr, int *total)
> > +{
> > + struct bpf_tramp_progs *tprogs;
> > + struct bpf_prog **progs;
> > + struct bpf_prog_aux *aux;
> > + int kind;
> > +
> > + *total = 0;
> > + tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),
>
> same nit as above, sizeof(*tprogs) is shorter and less error-prone

Sure, can fix it.

- KP

>
> > + GFP_KERNEL);
> > + if (!tprogs)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> > + tprogs[kind].nr_progs = tr->progs_cnt[kind];
> > + *total += tr->progs_cnt[kind];
> > + progs = tprogs[kind].progs;
> > +
> > + hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist)
> > + *progs++ = aux->prog;
> > + }
> > + return tprogs;
> > +}
> >
>
> [...]

2020-03-03 22:27:21

by KP Singh

Subject: Re: [PATCH bpf-next 0/7] Introduce BPF_MODIFY_RET tracing progs.

On 03-Mär 14:12, Andrii Nakryiko wrote:
> On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
> >
> > From: KP Singh <[email protected]>
> >
> > This was brought up in the KRSI v4 discussion and found to be useful
> > both for security and tracing programs.
> >
> > https://lore.kernel.org/bpf/[email protected]/
> >
> > The modify_return programs are allowed for security hooks (with an
> > extra CAP_MAC_ADMIN check) and functions whitelisted for error
> > injection (ALLOW_ERROR_INJECTION).
> >
> > The "security_" check is expected to be cleaned up with the KRSI patch
> > series.
> >
> > Here is an example of how a fmod_ret program behaves:
> >
> > int func_to_be_attached(int a, int b)
> > { <--- do_fentry
> >
> > do_fmod_ret:
> > <update ret by calling fmod_ret>
> > if (ret != 0)
> > goto do_fexit;
> >
> > original_function:
> >
> > <side_effects_happen_here>
> >
> > } <--- do_fexit
> >
> > ALLOW_ERROR_INJECTION(func_to_be_attached, ERRNO)
> >
> > The fmod_ret program attached to this function can be defined as:
> >
> > SEC("fmod_ret/func_to_be_attached")
> > BPF_PROG(func_name, int a, int b, int ret)
>
> nit: int BPF_PROG(...)

Noted. Thanks!

- KP

>
>
> > {
> > // This will skip the original function logic.
> > return -1;
> > }
> >
> > KP Singh (7):
> > bpf: Refactor trampoline update code
> > bpf: JIT helpers for fmod_ret progs
> > bpf: Introduce BPF_MODIFY_RETURN
> > bpf: Attachment verification for BPF_MODIFY_RETURN
> > tools/libbpf: Add support for BPF_MODIFY_RETURN
> > bpf: Add test ops for BPF_PROG_TYPE_TRACING
> > bpf: Add selftests for BPF_MODIFY_RETURN
> >
> > arch/x86/net/bpf_jit_comp.c | 261 +++++++++++++-----
> > include/linux/bpf.h | 24 +-
> > include/uapi/linux/bpf.h | 1 +
> > kernel/bpf/bpf_struct_ops.c | 13 +-
> > kernel/bpf/btf.c | 27 +-
> > kernel/bpf/syscall.c | 1 +
> > kernel/bpf/trampoline.c | 66 +++--
> > kernel/bpf/verifier.c | 32 +++
> > kernel/trace/bpf_trace.c | 1 +
> > net/bpf/test_run.c | 57 +++-
> > tools/include/uapi/linux/bpf.h | 1 +
> > tools/lib/bpf/libbpf.c | 4 +
> > .../selftests/bpf/prog_tests/fentry_fexit.c | 12 +-
> > .../selftests/bpf/prog_tests/fentry_test.c | 14 +-
> > .../selftests/bpf/prog_tests/fexit_test.c | 69 ++---
> > .../selftests/bpf/prog_tests/modify_return.c | 65 +++++
> > .../selftests/bpf/progs/modify_return.c | 49 ++++
> > 17 files changed, 509 insertions(+), 188 deletions(-)
> > create mode 100644 tools/testing/selftests/bpf/prog_tests/modify_return.c
> > create mode 100644 tools/testing/selftests/bpf/progs/modify_return.c
> >
> > --
> > 2.20.1
> >

2020-03-03 22:27:22

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 2/7] bpf: JIT helpers for fmod_ret progs

On Tue, Mar 3, 2020 at 6:13 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> * Split the invoke_bpf program to prepare for special handling of
> fmod_ret programs introduced in a subsequent patch.
> * Move the definition of emit_cond_near_jump and emit_nops as they are
> needed for fmod_ret.
> * Refactor branch target alignment into its own function
> align16_branch_target.
>
> Signed-off-by: KP Singh <[email protected]>
> ---
> arch/x86/net/bpf_jit_comp.c | 158 ++++++++++++++++++++----------------
> 1 file changed, 90 insertions(+), 68 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 15c7d28bc05c..475e354c2e88 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1361,35 +1361,100 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
> -(stack_size - i * 8));
> }
>

[...]

> +
> +/* From Intel 64 and IA-32 Architectures Optimization
> + * Reference Manual, 3.4.1.4 Code Alignment, Assembly/Compiler
> + * Coding Rule 11: All branch targets should be 16-byte
> + * aligned.
> + */
> +static void align16_branch_target(u8 **pprog)
> +{
> + u8 *target, *prog = *pprog;
> +
> + target = PTR_ALIGN(prog, 16);
> + if (target != prog)
> + emit_nops(&prog, target - prog);
> + if (target != prog)
> + pr_err("calcultion error\n");

this wasn't in the original code, do you feel like it's more important
to check this and print error?

also typo: calculation error, but then it's a bit brief and
uninformative message. So I don't know, maybe just drop it?

> +}
> +
> +static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> +{
> + u8 *prog = *pprog;
> + int cnt = 0;
> + s64 offset;
> +
> + offset = func - (ip + 2 + 4);
> + if (!is_simm32(offset)) {
> + pr_err("Target %p is out of range\n", func);
> + return -EINVAL;
> + }
> + EMIT2_off32(0x0F, jmp_cond + 0x10, offset);
> + *pprog = prog;
> + return 0;
> +}
> +

[...]

2020-03-03 22:28:41

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next 2/7] bpf: JIT helpers for fmod_ret progs

On 03-Mär 14:26, Andrii Nakryiko wrote:
> On Tue, Mar 3, 2020 at 6:13 AM KP Singh <[email protected]> wrote:
> >
> > From: KP Singh <[email protected]>
> >
> > * Split the invoke_bpf program to prepare for special handling of
> > fmod_ret programs introduced in a subsequent patch.
> > * Move the definition of emit_cond_near_jump and emit_nops as they are
> > needed for fmod_ret.
> > * Refactor branch target alignment into its own function
> > align16_branch_target.
> >
> > Signed-off-by: KP Singh <[email protected]>
> > ---
> > arch/x86/net/bpf_jit_comp.c | 158 ++++++++++++++++++++----------------
> > 1 file changed, 90 insertions(+), 68 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 15c7d28bc05c..475e354c2e88 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1361,35 +1361,100 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
> > -(stack_size - i * 8));
> > }
> >
>
> [...]
>
> > +
> > +/* From Intel 64 and IA-32 Architectures Optimization
> > + * Reference Manual, 3.4.1.4 Code Alignment, Assembly/Compiler
> > + * Coding Rule 11: All branch targets should be 16-byte
> > + * aligned.
> > + */
> > +static void align16_branch_target(u8 **pprog)
> > +{
> > + u8 *target, *prog = *pprog;
> > +
> > + target = PTR_ALIGN(prog, 16);
> > + if (target != prog)
> > + emit_nops(&prog, target - prog);
> > + if (target != prog)
> > + pr_err("calcultion error\n");
>
> this wasn't in the original code, do you feel like it's more important
> to check this and print error?
>
> also typo: calculation error, but then it's a bit brief and
> uninformative message. So I don't know, maybe just drop it?

Ah, good catch! This is definitely not intended to be here.
It's a debug artifact and needs to be dropped indeed.

- KP

>
> > +}
> > +
> > +static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
> > +{
> > + u8 *prog = *pprog;
> > + int cnt = 0;
> > + s64 offset;
> > +
> > + offset = func - (ip + 2 + 4);
> > + if (!is_simm32(offset)) {
> > + pr_err("Target %p is out of range\n", func);
> > + return -EINVAL;
> > + }
> > + EMIT2_off32(0x0F, jmp_cond + 0x10, offset);
> > + *pprog = prog;
> > + return 0;
> > +}
> > +
>
> [...]

2020-03-03 22:38:36

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 3/7] bpf: Introduce BPF_MODIFY_RETURN

On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> When multiple programs are attached, each program receives the return
> value from the previous program on the stack and the last program
> provides the return value to the attached function.
>
> The fmod_ret bpf programs are run after the fentry programs and before
> the fexit programs. The original function is only called if all the
> fmod_ret programs return 0 to avoid any unintended side-effects. The
> success value, i.e. 0 is not currently configurable but can be made so
> where user-space can specify it at load time.
>
> For example:
>
> int func_to_be_attached(int a, int b)
> { <--- do_fentry
>
> do_fmod_ret:
> <update ret by calling fmod_ret>
> if (ret != 0)
> goto do_fexit;
>
> original_function:
>
> <side_effects_happen_here>
>
> } <--- do_fexit
>
> The fmod_ret program attached to this function can be defined as:
>
> SEC("fmod_ret/func_to_be_attached")
> BPF_PROG(func_name, int a, int b, int ret)

same as on cover letter, return type is missing

> {
> // This will skip the original function logic.
> return 1;
> }
>
> The first fmod_ret program is passed 0 in its return argument.
>
> Signed-off-by: KP Singh <[email protected]>
> ---
> arch/x86/net/bpf_jit_comp.c | 96 ++++++++++++++++++++++++++++++++--
> include/linux/bpf.h | 1 +
> include/uapi/linux/bpf.h | 1 +
> kernel/bpf/btf.c | 3 +-
> kernel/bpf/syscall.c | 1 +
> kernel/bpf/trampoline.c | 5 +-
> kernel/bpf/verifier.c | 1 +
> tools/include/uapi/linux/bpf.h | 1 +
> 8 files changed, 103 insertions(+), 6 deletions(-)
>

[...]

>
> + if (fmod_ret->nr_progs) {
> + branches = kcalloc(fmod_ret->nr_progs, sizeof(u8 *),
> + GFP_KERNEL);
> + if (!branches)
> + return -ENOMEM;
> + if (invoke_bpf_mod_ret(m, &prog, fmod_ret, stack_size,
> + branches))

branches leaks here

> + return -EINVAL;
> + }
> +
> if (flags & BPF_TRAMP_F_CALL_ORIG) {
> - if (fentry->nr_progs)
> + if (fentry->nr_progs || fmod_ret->nr_progs)
> restore_regs(m, &prog, nr_args, stack_size);
>
> /* call original function */
> @@ -1573,6 +1649,14 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,

there is early return one line above here, you need to free branches
in that case to not leak memory

So I guess it's better to do goto cleanup approach at this point?

> emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
> }
>
> + if (fmod_ret->nr_progs) {
> + align16_branch_target(&prog);
> + for (i = 0; i < fmod_ret->nr_progs; i++)
> + emit_cond_near_jump(&branches[i], prog, branches[i],
> + X86_JNE);
> + kfree(branches);
> + }
> +

[...]

2020-03-03 22:46:07

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 5/7] tools/libbpf: Add support for BPF_MODIFY_RETURN

On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> Signed-off-by: KP Singh <[email protected]>
> ---

Acked-by: Andrii Nakryiko <[email protected]>

> tools/lib/bpf/libbpf.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index f8c4042e5855..223be01dc466 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -6288,6 +6288,10 @@ static const struct bpf_sec_def section_defs[] = {
> .expected_attach_type = BPF_TRACE_FENTRY,
> .is_attach_btf = true,
> .attach_fn = attach_trace),
> + SEC_DEF("fmod_ret/", TRACING,
> + .expected_attach_type = BPF_MODIFY_RETURN,
> + .is_attach_btf = true,
> + .attach_fn = attach_trace),
> SEC_DEF("fexit/", TRACING,
> .expected_attach_type = BPF_TRACE_FEXIT,
> .is_attach_btf = true,
> --
> 2.20.1
>

2020-03-03 22:46:41

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 4/7] bpf: Attachment verification for BPF_MODIFY_RETURN

On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> - Functions that are whitelisted for error injection, i.e.
> within_error_injection_list.
> - Security hooks, this is expected to be cleaned up after the KRSI
> patches introduce the LSM_HOOK macro:
>
> https://lore.kernel.org/bpf/[email protected]/

The commit message can use a bit more work for sure. The why (and even
the what) of the changes is not really explained well.

>
> - The attachment is currently limited to functions that return an int.
> This can be extended later other types (e.g. PTR).
>
> Signed-off-by: KP Singh <[email protected]>
> ---
> kernel/bpf/btf.c | 28 ++++++++++++++++++++--------
> kernel/bpf/verifier.c | 31 +++++++++++++++++++++++++++++++
> 2 files changed, 51 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 30841fb8b3c0..50080add2ab9 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -3710,14 +3710,26 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
> nr_args--;
> }
>
> - if ((prog->expected_attach_type == BPF_TRACE_FEXIT ||
> - prog->expected_attach_type == BPF_MODIFY_RETURN) &&
> - arg == nr_args) {
> - if (!t)
> - /* Default prog with 5 args. 6th arg is retval. */
> - return true;
> - /* function return type */
> - t = btf_type_by_id(btf, t->type);
> + if (arg == nr_args) {
> + if (prog->expected_attach_type == BPF_TRACE_FEXIT) {
> + if (!t)
> + return true;
> + t = btf_type_by_id(btf, t->type);
> + } else if (prog->expected_attach_type == BPF_MODIFY_RETURN) {
> + /* For now the BPF_MODIFY_RETURN can only be attached to
> + * functions that return an int.
> + */
> + if (!t)
> + return false;
> +
> + t = btf_type_skip_modifiers(btf, t->type, NULL);
> + if (!btf_type_is_int(t)) {

Should the size of int be verified here? E.g., if some function
returns u8, is that ok for BPF program to return, say, (1<<30) ?

> + bpf_log(log,
> + "ret type %s not allowed for fmod_ret\n",
> + btf_kind_str[BTF_INFO_KIND(t->info)]);
> + return false;
> + }
> + }
> } else if (arg >= nr_args) {
> bpf_log(log, "func '%s' doesn't have %d-th argument\n",
> tname, arg + 1);
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 2460c8e6b5be..ae32517d4ccd 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -19,6 +19,7 @@
> #include <linux/sort.h>
> #include <linux/perf_event.h>
> #include <linux/ctype.h>
> +#include <linux/error-injection.h>
>
> #include "disasm.h"
>
> @@ -9800,6 +9801,33 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
>
> return 0;
> }
> +#define SECURITY_PREFIX "security_"
> +
> +static int check_attach_modify_return(struct bpf_verifier_env *env)
> +{
> + struct bpf_prog *prog = env->prog;
> + unsigned long addr = (unsigned long) prog->aux->trampoline->func.addr;
> +
> + if (within_error_injection_list(addr))
> + return 0;
> +
> + /* This is expected to be cleaned up in the future with the KRSI effort
> + * introducing the LSM_HOOK macro for cleaning up lsm_hooks.h.
> + */
> + if (!strncmp(SECURITY_PREFIX, prog->aux->attach_func_name,
> + sizeof(SECURITY_PREFIX) - 1)) {
> +
> + if (!capable(CAP_MAC_ADMIN))
> + return -EPERM;
> +
> + return 0;
> + }
> +
> + verbose(env, "fmod_ret attach_btf_id %u (%s) is not modifiable\n",
> + prog->aux->attach_btf_id, prog->aux->attach_func_name);
> +
> + return -EINVAL;
> +}
>
> static int check_attach_btf_id(struct bpf_verifier_env *env)
> {
> @@ -10000,6 +10028,9 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
> }
> tr->func.addr = (void *)addr;
> prog->aux->trampoline = tr;
> +
> + if (prog->expected_attach_type == BPF_MODIFY_RETURN)
> + ret = check_attach_modify_return(env);
> out:
> mutex_unlock(&tr->mutex);
> if (ret)
> --
> 2.20.1
>

2020-03-03 22:52:00

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next 3/7] bpf: Introduce BPF_MODIFY_RETURN

On 03-Mär 14:37, Andrii Nakryiko wrote:
> On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
> >
> > From: KP Singh <[email protected]>
> >
> > When multiple programs are attached, each program receives the return
> > value from the previous program on the stack and the last program
> > provides the return value to the attached function.
> >
> > The fmod_ret bpf programs are run after the fentry programs and before
> > the fexit programs. The original function is only called if all the
> > fmod_ret programs return 0 to avoid any unintended side-effects. The
> > success value, i.e. 0 is not currently configurable but can be made so
> > where user-space can specify it at load time.
> >
> > For example:
> >
> > int func_to_be_attached(int a, int b)
> > { <--- do_fentry
> >
> > do_fmod_ret:
> > <update ret by calling fmod_ret>
> > if (ret != 0)
> > goto do_fexit;
> >
> > original_function:
> >
> > <side_effects_happen_here>
> >
> > } <--- do_fexit
> >
> > The fmod_ret program attached to this function can be defined as:
> >
> > SEC("fmod_ret/func_to_be_attached")
> > BPF_PROG(func_name, int a, int b, int ret)
>
> same as on cover letter, return type is missing

Fixed. Thanks!

>
> > {
> > // This will skip the original function logic.
> > return 1;
> > }
> >
> > The first fmod_ret program is passed 0 in its return argument.
> >
> > Signed-off-by: KP Singh <[email protected]>
> > ---
> > arch/x86/net/bpf_jit_comp.c | 96 ++++++++++++++++++++++++++++++++--
> > include/linux/bpf.h | 1 +
> > include/uapi/linux/bpf.h | 1 +
> > kernel/bpf/btf.c | 3 +-
> > kernel/bpf/syscall.c | 1 +
> > kernel/bpf/trampoline.c | 5 +-
> > kernel/bpf/verifier.c | 1 +
> > tools/include/uapi/linux/bpf.h | 1 +
> > 8 files changed, 103 insertions(+), 6 deletions(-)
> >
>
> [...]
>
> >
> > + if (fmod_ret->nr_progs) {
> > + branches = kcalloc(fmod_ret->nr_progs, sizeof(u8 *),
> > + GFP_KERNEL);
> > + if (!branches)
> > + return -ENOMEM;
> > + if (invoke_bpf_mod_ret(m, &prog, fmod_ret, stack_size,
> > + branches))
>
> branches leaks here

Good catch, sloppy work here by me.

>
> > + return -EINVAL;
> > + }
> > +
> > if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > - if (fentry->nr_progs)
> > + if (fentry->nr_progs || fmod_ret->nr_progs)
> > restore_regs(m, &prog, nr_args, stack_size);
> >
> > /* call original function */
> > @@ -1573,6 +1649,14 @@ int arch_prepare_bpf_trampoline(void *image, void *image_end,
>
> there is early return one line above here, you need to free branches
> in that case to not leak memory
>
> So I guess it's better to do goto cleanup approach at this point?

yeah, agreed, updated to doing a cleanup at the end.

- KP

>
> > emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
> > }
> >
> > + if (fmod_ret->nr_progs) {
> > + align16_branch_target(&prog);
> > + for (i = 0; i < fmod_ret->nr_progs; i++)
> > + emit_cond_near_jump(&branches[i], prog, branches[i],
> > + X86_JNE);
> > + kfree(branches);
> > + }
> > +
>
> [...]

2020-03-03 22:53:31

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 6/7] bpf: Add test ops for BPF_PROG_TYPE_TRACING

On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> The current fexit and fentry tests rely on a different program to
> exercise the functions they attach to. Instead of doing this, implement
> the test operations for tracing which will also be used for
> BPF_OVERRIDE_RETURN in a subsequent patch.

typo: BPF_OVERRIDE_RETURN -> BPF_MODIFY_RETURN?

>
> Also, clean up the fexit test to use the generated skeleton.
>
> Signed-off-by: KP Singh <[email protected]>
> ---

Nice clean up for fexit_test, thank you!

> include/linux/bpf.h | 10 +++
> kernel/trace/bpf_trace.c | 1 +
> net/bpf/test_run.c | 38 +++++++---
> .../selftests/bpf/prog_tests/fentry_fexit.c | 12 +---
> .../selftests/bpf/prog_tests/fentry_test.c | 14 ++--
> .../selftests/bpf/prog_tests/fexit_test.c | 69 ++++++-------------
> 6 files changed, 68 insertions(+), 76 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 3cfdc216a2f4..c00919025532 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1156,6 +1156,9 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> union bpf_attr __user *uattr);
> int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> union bpf_attr __user *uattr);
> +int bpf_prog_test_run_tracing(struct bpf_prog *prog,
> + const union bpf_attr *kattr,
> + union bpf_attr __user *uattr);
> int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> const union bpf_attr *kattr,
> union bpf_attr __user *uattr);
> @@ -1313,6 +1316,13 @@ static inline int bpf_prog_test_run_skb(struct bpf_prog *prog,
> return -ENOTSUPP;
> }
>
> +static inline int bpf_prog_test_run_tracing(struct bpf_prog *prog,
> + const union bpf_attr *kattr,
> + union bpf_attr __user *uattr)
> +{
> + return -ENOTSUPP;
> +}
> +
> static inline int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> const union bpf_attr *kattr,
> union bpf_attr __user *uattr)
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 07764c761073..363e0a2c75cf 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1266,6 +1266,7 @@ const struct bpf_verifier_ops tracing_verifier_ops = {
> };
>
> const struct bpf_prog_ops tracing_prog_ops = {
> + .test_run = bpf_prog_test_run_tracing,
> };
>
> static bool raw_tp_writable_prog_is_valid_access(int off, int size,
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 562443f94133..fb54b45285b4 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -160,18 +160,38 @@ static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
> kfree(data);
> return ERR_PTR(-EFAULT);
> }
> - if (bpf_fentry_test1(1) != 2 ||
> - bpf_fentry_test2(2, 3) != 5 ||
> - bpf_fentry_test3(4, 5, 6) != 15 ||
> - bpf_fentry_test4((void *)7, 8, 9, 10) != 34 ||
> - bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
> - bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111) {
> - kfree(data);
> - return ERR_PTR(-EFAULT);
> - }
> +
> return data;
> }
>
> +int bpf_prog_test_run_tracing(struct bpf_prog *prog,
> + const union bpf_attr *kattr,
> + union bpf_attr __user *uattr)
> +{
> + int err = -EFAULT;
> +
> + switch (prog->expected_attach_type) {
> + case BPF_TRACE_FENTRY:
> + case BPF_TRACE_FEXIT:
> + if (bpf_fentry_test1(1) != 2 ||
> + bpf_fentry_test2(2, 3) != 5 ||
> + bpf_fentry_test3(4, 5, 6) != 15 ||
> + bpf_fentry_test4((void *)7, 8, 9, 10) != 34 ||
> + bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
> + bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111)
> + goto out;
> + break;
> + default:
> + goto out;
> + }

No trace_bpf_test_finish here?

> +
> + return 0;
> +
> +out:
> + trace_bpf_test_finish(&err);
> + return err;
> +}
> +
> static void *bpf_ctx_init(const union bpf_attr *kattr, u32 max_size)
> {
> void __user *data_in = u64_to_user_ptr(kattr->test.ctx_in);

[...]

2020-03-03 22:58:58

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 7/7] bpf: Add selftests for BPF_MODIFY_RETURN

On Tue, Mar 3, 2020 at 6:13 AM KP Singh <[email protected]> wrote:
>
> From: KP Singh <[email protected]>
>
> Test for two scenarios:
>
> * When the fmod_ret program returns 0, the original function should
> be called along with fentry and fexit programs.
> * When the fmod_ret program returns a non-zero value, the original
> function should not be called, no side effect should be observed and
> fentry and fexit programs should be called.
>
> The result from the kernel function call and whether a side-effect is
> observed is returned via the retval attr of the BPF_PROG_TEST_RUN (bpf)
> syscall.
>
> Signed-off-by: KP Singh <[email protected]>
> ---

minor nits only

Acked-by: Andrii Nakryiko <[email protected]>

> net/bpf/test_run.c | 23 ++++++-
> .../selftests/bpf/prog_tests/modify_return.c | 65 +++++++++++++++++++
> .../selftests/bpf/progs/modify_return.c | 49 ++++++++++++++
> 3 files changed, 135 insertions(+), 2 deletions(-)
> create mode 100644 tools/testing/selftests/bpf/prog_tests/modify_return.c
> create mode 100644 tools/testing/selftests/bpf/progs/modify_return.c
>

[...]

>
> - return 0;
> + retval = (u32)side_effect << 16 | ret;

uhm, I didn't look up operator priority table, but I'd rather have ()
around bit shift operation :)

> + if (copy_to_user(&uattr->test.retval, &retval, sizeof(retval)))
> + goto out;
>
> + return 0;
> out:
> trace_bpf_test_finish(&err);
> return err;
> diff --git a/tools/testing/selftests/bpf/prog_tests/modify_return.c b/tools/testing/selftests/bpf/prog_tests/modify_return.c
> new file mode 100644
> index 000000000000..beab9a37f35c
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/modify_return.c
> @@ -0,0 +1,65 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Copyright 2020 Google LLC.
> + */
> +
> +#include <test_progs.h>
> +#include "modify_return.skel.h"
> +
> +#define LOWER(x) (x & 0xffff)
> +#define UPPER(x) (x >> 16)

pedantic nit: (x) instead of just x

> +
> +

[...]

2020-03-03 22:59:13

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next 6/7] bpf: Add test ops for BPF_PROG_TYPE_TRACING

On 03-Mär 14:51, Andrii Nakryiko wrote:
> On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
> >
> > From: KP Singh <[email protected]>
> >
> > The current fexit and fentry tests rely on a different program to
> > exercise the functions they attach to. Instead of doing this, implement
> > the test operations for tracing which will also be used for
> > BPF_OVERRIDE_RETURN in a subsequent patch.
>
> typo: BPF_OVERRIDE_RETURN -> BPF_MODIFY_RETURN?

Oops :) Fixed. Thanks! Artifacts of renaming.

>
> >
> > Also, clean up the fexit test to use the generated skeleton.
> >
> > Signed-off-by: KP Singh <[email protected]>
> > ---
>
> Nice clean up for fexit_test, thank you!

It was very satisfying :)

>
> > include/linux/bpf.h | 10 +++
> > kernel/trace/bpf_trace.c | 1 +
> > net/bpf/test_run.c | 38 +++++++---
> > .../selftests/bpf/prog_tests/fentry_fexit.c | 12 +---
> > .../selftests/bpf/prog_tests/fentry_test.c | 14 ++--
> > .../selftests/bpf/prog_tests/fexit_test.c | 69 ++++++-------------
> > 6 files changed, 68 insertions(+), 76 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 3cfdc216a2f4..c00919025532 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1156,6 +1156,9 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
> > union bpf_attr __user *uattr);
> > int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> > union bpf_attr __user *uattr);
> > +int bpf_prog_test_run_tracing(struct bpf_prog *prog,
> > + const union bpf_attr *kattr,
> > + union bpf_attr __user *uattr);
> > int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > const union bpf_attr *kattr,
> > union bpf_attr __user *uattr);
> > @@ -1313,6 +1316,13 @@ static inline int bpf_prog_test_run_skb(struct bpf_prog *prog,
> > return -ENOTSUPP;
> > }
> >
> > +static inline int bpf_prog_test_run_tracing(struct bpf_prog *prog,
> > + const union bpf_attr *kattr,
> > + union bpf_attr __user *uattr)
> > +{
> > + return -ENOTSUPP;
> > +}
> > +
> > static inline int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
> > const union bpf_attr *kattr,
> > union bpf_attr __user *uattr)
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 07764c761073..363e0a2c75cf 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1266,6 +1266,7 @@ const struct bpf_verifier_ops tracing_verifier_ops = {
> > };
> >
> > const struct bpf_prog_ops tracing_prog_ops = {
> > + .test_run = bpf_prog_test_run_tracing,
> > };
> >
> > static bool raw_tp_writable_prog_is_valid_access(int off, int size,
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index 562443f94133..fb54b45285b4 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -160,18 +160,38 @@ static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
> > kfree(data);
> > return ERR_PTR(-EFAULT);
> > }
> > - if (bpf_fentry_test1(1) != 2 ||
> > - bpf_fentry_test2(2, 3) != 5 ||
> > - bpf_fentry_test3(4, 5, 6) != 15 ||
> > - bpf_fentry_test4((void *)7, 8, 9, 10) != 34 ||
> > - bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
> > - bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111) {
> > - kfree(data);
> > - return ERR_PTR(-EFAULT);
> > - }
> > +
> > return data;
> > }
> >
> > +int bpf_prog_test_run_tracing(struct bpf_prog *prog,
> > + const union bpf_attr *kattr,
> > + union bpf_attr __user *uattr)
> > +{
> > + int err = -EFAULT;
> > +
> > + switch (prog->expected_attach_type) {
> > + case BPF_TRACE_FENTRY:
> > + case BPF_TRACE_FEXIT:
> > + if (bpf_fentry_test1(1) != 2 ||
> > + bpf_fentry_test2(2, 3) != 5 ||
> > + bpf_fentry_test3(4, 5, 6) != 15 ||
> > + bpf_fentry_test4((void *)7, 8, 9, 10) != 34 ||
> > + bpf_fentry_test5(11, (void *)12, 13, 14, 15) != 65 ||
> > + bpf_fentry_test6(16, (void *)17, 18, 19, (void *)20, 21) != 111)
> > + goto out;
> > + break;
> > + default:
> > + goto out;
> > + }
>
> No trace_bpf_test_finish here?

Ah yes, we trace it not only for erroneous cases. Changed it to
setting err = 0 and falling through to trace_bpf_test_finish.

- KP

>
> > +
> > + return 0;
> > +
> > +out:
> > + trace_bpf_test_finish(&err);
> > + return err;
> > +}
> > +
> > static void *bpf_ctx_init(const union bpf_attr *kattr, u32 max_size)
> > {
> > void __user *data_in = u64_to_user_ptr(kattr->test.ctx_in);
>
> [...]

2020-03-03 23:04:50

by Andrii Nakryiko

[permalink] [raw]
Subject: Re: [PATCH bpf-next 1/7] bpf: Refactor trampoline update code

On Tue, Mar 3, 2020 at 2:24 PM KP Singh <[email protected]> wrote:
>
> On 03-Mär 14:12, Andrii Nakryiko wrote:
> > On Tue, Mar 3, 2020 at 6:13 AM KP Singh <[email protected]> wrote:
> > >
> > > From: KP Singh <[email protected]>
> > >
> > > As we need to introduce a third type of attachment for trampolines, the
> > > flattened signature of arch_prepare_bpf_trampoline gets even more
> > > complicated.
> > >
> > > Refactor the prog and count argument to arch_prepare_bpf_trampoline to
> > > use bpf_tramp_progs to simplify the addition and accounting for new
> > > attachment types.
> > >
> > > Signed-off-by: KP Singh <[email protected]>
> > > ---
> > > arch/x86/net/bpf_jit_comp.c | 31 +++++++++---------
> > > include/linux/bpf.h | 13 ++++++--
> > > kernel/bpf/bpf_struct_ops.c | 13 +++++++-
> > > kernel/bpf/trampoline.c | 63 +++++++++++++++++++++----------------
> > > 4 files changed, 75 insertions(+), 45 deletions(-)
> > >
> > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > index 9ba08e9abc09..15c7d28bc05c 100644
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -1362,12 +1362,12 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
> > > }
> > >
> > > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > > - struct bpf_prog **progs, int prog_cnt, int stack_size)
> > > + struct bpf_tramp_progs *tp, int stack_size)
> >
> > nit: it's `tp` here, but `tprogs` in arch_prepare_bpf_trampoline. It's
> > minor, but would be nice to stick to consistent naming.
>
> I did this to ~distinguish~ that rather than being an array of
> tprogs it's a pointer to one of its members, e.g.
> &tprogs[BPF_TRAMP_FEXIT].
>
> I can change it if you feel this is not a valuable distinction.

I think it's an important distinction, but naming doesn't really help
with it... Not sure how you can make it more clear, though.
>
> >
> > > {
> > > u8 *prog = *pprog;
> > > int cnt = 0, i;
> > >
> > > - for (i = 0; i < prog_cnt; i++) {
> > > + for (i = 0; i < tp->nr_progs; i++) {
> > > if (emit_call(&prog, __bpf_prog_enter, prog))
> > > return -EINVAL;
> > > /* remember prog start time returned by __bpf_prog_enter */
> > > @@ -1376,17 +1376,17 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > > /* arg1: lea rdi, [rbp - stack_size] */
> > > EMIT4(0x48, 0x8D, 0x7D, -stack_size);
> > > /* arg2: progs[i]->insnsi for interpreter */
> > > - if (!progs[i]->jited)
> > > + if (!tp->progs[i]->jited)
> > > emit_mov_imm64(&prog, BPF_REG_2,
> > > - (long) progs[i]->insnsi >> 32,
> > > - (u32) (long) progs[i]->insnsi);
> > > + (long) tp->progs[i]->insnsi >> 32,
> > > + (u32) (long) tp->progs[i]->insnsi);
> > > /* call JITed bpf program or interpreter */
> > > - if (emit_call(&prog, progs[i]->bpf_func, prog))
> > > + if (emit_call(&prog, tp->progs[i]->bpf_func, prog))
> > > return -EINVAL;
> > >
> > > /* arg1: mov rdi, progs[i] */
> > > - emit_mov_imm64(&prog, BPF_REG_1, (long) progs[i] >> 32,
> > > - (u32) (long) progs[i]);
> > > + emit_mov_imm64(&prog, BPF_REG_1, (long) tp->progs[i] >> 32,
> > > + (u32) (long) tp->progs[i]);
> > > /* arg2: mov rsi, rbx <- start time in nsec */
> > > emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
> > > if (emit_call(&prog, __bpf_prog_exit, prog))
> > > @@ -1458,12 +1458,13 @@ static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > > */
> > > int arch_prepare_bpf_trampoline(void *image, void *image_end,
> > > const struct btf_func_model *m, u32 flags,
> > > - struct bpf_prog **fentry_progs, int fentry_cnt,
> > > - struct bpf_prog **fexit_progs, int fexit_cnt,
> > > + struct bpf_tramp_progs *tprogs,
> > > void *orig_call)
> > > {
> > > int cnt = 0, nr_args = m->nr_args;
> > > int stack_size = nr_args * 8;
> > > + struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> > > + struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> > > u8 *prog;
> > >
> > > /* x86-64 supports up to 6 arguments. 7+ can be added in the future */
> >
> > [...]
> >
> > > diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
> > > index c498f0fffb40..a011a77b21fa 100644
> > > --- a/kernel/bpf/bpf_struct_ops.c
> > > +++ b/kernel/bpf/bpf_struct_ops.c
> > > @@ -320,6 +320,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > struct bpf_struct_ops_value *uvalue, *kvalue;
> > > const struct btf_member *member;
> > > const struct btf_type *t = st_ops->type;
> > > + struct bpf_tramp_progs *tprogs = NULL;
> > > void *udata, *kdata;
> > > int prog_fd, err = 0;
> > > void *image;
> > > @@ -425,10 +426,19 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > goto reset_unlock;
> > > }
> > >
> > > + tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),
> >
> > nit: sizeof(*tprogs) ?
>
> Sure. I can fix it.
>
> >
> > > + GFP_KERNEL);
> > > + if (!tprogs) {
> > > + err = -ENOMEM;
> > > + goto reset_unlock;
> > > + }
> > > +
> > > + *tprogs[BPF_TRAMP_FENTRY].progs = prog;
> >
> > I'm very confused what's going on here, why * at the beginning here,
> > but no * below?.. It seems unnecessary.
>
> The progs member of bpf_tramp_progs is
>
> struct bpf_tramp_progs {
> struct bpf_prog *progs[BPF_MAX_TRAMP_PROGS];
> int nr_progs;
> };
>
> Equivalent to the **progs we had before in the signature of
> arch_prepare_bpf_trampoline.
>
> *tprogs[BPF_TRAMP_FENTRY].progs = prog;
>
> is setting the program in the **progs. The one below is setting the
> count. Am I missing something :)

Ok, so it's setting entry 0 in bpf_tramp_progs->progs array, right?
Wouldn't it be less mind-bending and confusing written this way:

tprogs[BPF_TRAMP_FENTRY].progs[0] = prog;

?

The syntax you used treats the fixed-length progs array as a pointer, which
is valid C, but not the best C either.

>
> >
> > > + tprogs[BPF_TRAMP_FENTRY].nr_progs = 1;
> > > err = arch_prepare_bpf_trampoline(image,
> > > st_map->image + PAGE_SIZE,
> > > &st_ops->func_models[i], 0,
> > > - &prog, 1, NULL, 0, NULL);
> > > + tprogs, NULL);
> > > if (err < 0)
> > > goto reset_unlock;
> > >
> > > @@ -469,6 +479,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
> > > memset(uvalue, 0, map->value_size);
> > > memset(kvalue, 0, map->value_size);
> > > unlock:
> > > + kfree(tprogs);
> > > mutex_unlock(&st_map->lock);
> > > return err;
> > > }
> > > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > > index 704fa787fec0..9daeb094f054 100644
> > > --- a/kernel/bpf/trampoline.c
> > > +++ b/kernel/bpf/trampoline.c
> > > @@ -190,40 +190,50 @@ static int register_fentry(struct bpf_trampoline *tr, void *new_addr)
> > > return ret;
> > > }
> > >
> > > -/* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> > > - * bytes on x86. Pick a number to fit into BPF_IMAGE_SIZE / 2
> > > - */
> > > -#define BPF_MAX_TRAMP_PROGS 40
> > > +static struct bpf_tramp_progs *
> > > +bpf_trampoline_update_progs(struct bpf_trampoline *tr, int *total)
> > > +{
> > > + struct bpf_tramp_progs *tprogs;
> > > + struct bpf_prog **progs;
> > > + struct bpf_prog_aux *aux;
> > > + int kind;
> > > +
> > > + *total = 0;
> > > + tprogs = kcalloc(BPF_TRAMP_MAX, sizeof(struct bpf_tramp_progs),
> >
> > same nit as above, sizeof(*tprogs) is shorter and less error-prone
>
> Sure, can fix it.
>
> - KP
>
> >
> > > + GFP_KERNEL);
> > > + if (!tprogs)
> > > + return ERR_PTR(-ENOMEM);
> > > +
> > > + for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
> > > + tprogs[kind].nr_progs = tr->progs_cnt[kind];
> > > + *total += tr->progs_cnt[kind];
> > > + progs = tprogs[kind].progs;
> > > +
> > > + hlist_for_each_entry(aux, &tr->progs_hlist[kind], tramp_hlist)
> > > + *progs++ = aux->prog;
> > > + }
> > > + return tprogs;
> > > +}
> > >
> >
> > [...]

2020-03-03 23:09:05

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next 1/7] bpf: Refactor trampoline update code

On 03-Mär 15:03, Andrii Nakryiko wrote:
> On Tue, Mar 3, 2020 at 2:24 PM KP Singh <[email protected]> wrote:
> >
> > On 03-Mär 14:12, Andrii Nakryiko wrote:
> > > On Tue, Mar 3, 2020 at 6:13 AM KP Singh <[email protected]> wrote:
> > > >
> > > > From: KP Singh <[email protected]>
> > > >
> > > > As we need to introduce a third type of attachment for trampolines, the
> > > > flattened signature of arch_prepare_bpf_trampoline gets even more
> > > > complicated.
> > > >
> > > > Refactor the prog and count argument to arch_prepare_bpf_trampoline to
> > > > use bpf_tramp_progs to simplify the addition and accounting for new
> > > > attachment types.
> > > >
> > > > Signed-off-by: KP Singh <[email protected]>
> > > > ---
> > > > arch/x86/net/bpf_jit_comp.c | 31 +++++++++---------
> > > > include/linux/bpf.h | 13 ++++++--
> > > > kernel/bpf/bpf_struct_ops.c | 13 +++++++-
> > > > kernel/bpf/trampoline.c | 63 +++++++++++++++++++++----------------
> > > > 4 files changed, 75 insertions(+), 45 deletions(-)
> > > >
> > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > > index 9ba08e9abc09..15c7d28bc05c 100644
> > > > --- a/arch/x86/net/bpf_jit_comp.c
> > > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > > @@ -1362,12 +1362,12 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
> > > > }
> > > >
> > > > static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
> > > > - struct bpf_prog **progs, int prog_cnt, int stack_size)
> > > > + struct bpf_tramp_progs *tp, int stack_size)
> > >
> > > nit: it's `tp` here, but `tprogs` in arch_prepare_bpf_trampoline. It's
> > > minor, but would be nice to stick to consistent naming.
> >
> > I did this to distinguish that, rather than being an array of
> > tprogs, it's a pointer to one of its members, e.g.
> > &tprogs[BPF_TRAMP_FEXIT].
> >
> > I can change it if you feel this is not a valuable distinction.
>
> I think it's important distinction, but naming doesn't really help
> with it... Not sure how you can make it more clear, though.

I would prefer to keep the naming distinction. Hope that's okay with
you.

> >

[...]


> > >
> > count. Am I missing something :)
>
> Ok, so it's setting entry 0 in bpf_tramp_progs->progs array, right?
> Wouldn't it be less mind-bending and confusing written this way:
>
> tprogs[BPF_TRAMP_FENTRY].progs[0] = prog;

Definitely much cleaner/less mind bending :) Updated. Thanks!

- KP

>
> ?
>
> The syntax you used treats the fixed-length progs array as a pointer, which
> is valid C, but not the best C either.
>

[...]

> >
> > > [...]

2020-03-03 23:22:20

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next 4/7] bpf: Attachment verification for BPF_MODIFY_RETURN

On 03-Mär 14:44, Andrii Nakryiko wrote:
> On Tue, Mar 3, 2020 at 6:12 AM KP Singh <[email protected]> wrote:
> >
> > From: KP Singh <[email protected]>
> >
> > - Functions that are whitelisted for error injection, i.e.
> > within_error_injection_list.
> > - Security hooks, this is expected to be cleaned up after the KRSI
> > patches introduce the LSM_HOOK macro:
> >
> > https://lore.kernel.org/bpf/[email protected]/
>
> The commit message could use a bit more work for sure. The why (and even
> the what) of the changes is not really explained well.

Added some more details.

>
> >
> > - The attachment is currently limited to functions that return an int.
> > This can be extended later to other types (e.g. PTR).
> >
> > Signed-off-by: KP Singh <[email protected]>
> > ---
> > kernel/bpf/btf.c | 28 ++++++++++++++++++++--------
> > kernel/bpf/verifier.c | 31 +++++++++++++++++++++++++++++++
> > 2 files changed, 51 insertions(+), 8 deletions(-)
> >

[...]

> > + t = btf_type_skip_modifiers(btf, t->type, NULL);
> > + if (!btf_type_is_int(t)) {
>
> Should the size of int be verified here? E.g., if some function
> returns u8, is that ok for BPF program to return, say, (1<<30) ?

Would this work?

if (size != t->size) {
	bpf_log(log,
		"size accessed = %d should be %d\n",
		size, t->size);
	return false;
}

- KP

>
> > + bpf_log(log,
> > + "ret type %s not allowed for fmod_ret\n",
> > + btf_kind_str[BTF_INFO_KIND(t->info)]);
> > + return false;
> > + }
> > + }
> > } else if (arg >= nr_args) {
> > bpf_log(log, "func '%s' doesn't have %d-th argument\n",
> > tname, arg + 1);
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 2460c8e6b5be..ae32517d4ccd 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -19,6 +19,7 @@
> > #include <linux/sort.h>
> > #include <linux/perf_event.h>
> > #include <linux/ctype.h>
> > +#include <linux/error-injection.h>
> >
> > #include "disasm.h"
> >
> > @@ -9800,6 +9801,33 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
> >
> > return 0;
> > }
> > +#define SECURITY_PREFIX "security_"
> > +
> > +static int check_attach_modify_return(struct bpf_verifier_env *env)
> > +{
> > + struct bpf_prog *prog = env->prog;
> > + unsigned long addr = (unsigned long) prog->aux->trampoline->func.addr;
> > +
> > + if (within_error_injection_list(addr))
> > + return 0;
> > +
> > + /* This is expected to be cleaned up in the future with the KRSI effort
> > + * introducing the LSM_HOOK macro for cleaning up lsm_hooks.h.
> > + */
> > + if (!strncmp(SECURITY_PREFIX, prog->aux->attach_func_name,
> > + sizeof(SECURITY_PREFIX) - 1)) {
> > +
> > + if (!capable(CAP_MAC_ADMIN))
> > + return -EPERM;
> > +
> > + return 0;
> > + }
> > +
> > + verbose(env, "fmod_ret attach_btf_id %u (%s) is not modifiable\n",
> > + prog->aux->attach_btf_id, prog->aux->attach_func_name);
> > +
> > + return -EINVAL;
> > +}
> >
> > static int check_attach_btf_id(struct bpf_verifier_env *env)
> > {
> > @@ -10000,6 +10028,9 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
> > }
> > tr->func.addr = (void *)addr;
> > prog->aux->trampoline = tr;
> > +
> > + if (prog->expected_attach_type == BPF_MODIFY_RETURN)
> > + ret = check_attach_modify_return(env);
> > out:
> > mutex_unlock(&tr->mutex);
> > if (ret)
> > --
> > 2.20.1
> >

2020-03-03 23:56:45

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH bpf-next 2/7] bpf: JIT helpers for fmod_ret progs

On Tue, Mar 03, 2020 at 11:28:12PM +0100, KP Singh wrote:
> > > +static void align16_branch_target(u8 **pprog)
> > > +{
> > > + u8 *target, *prog = *pprog;
> > > +
> > > + target = PTR_ALIGN(prog, 16);
> > > + if (target != prog)
> > > + emit_nops(&prog, target - prog);
> > > + if (target != prog)
> > > + pr_err("calcultion error\n");
> >
> > this wasn't in the original code, do you feel like it's more important
> > to check this and print error?
> >
> > also typo: calculation error, but then it's a bit brief and
> > uninformative message. So I don't know, maybe just drop it?
>
> > Ah, good catch! This is definitely not intended to be here.
> > It's a debug artifact and needs to be dropped indeed.

That spurious pr_err() caught my attention as well.
After further analysis there is a bug here.
The function is missing last line:
*pprog = prog;
Without it the nop insertion is actually not happening.
Nops are being written, but the next insns will overwrite them.
When I noticed it by code review I applied the patches to my tree
and ran the tests and, as expected, all tests passed.
The existing test_xdp_veth.sh emits the largest number of unaligned
branches. Since then I've been thinking whether we could add a test
to catch things like this and couldn't come up with a way to test it
without burning a lot of code on it. So let's fix it and move on.
Could you rename this helper? Maybe emit_align() and pass 16 into it?
The code is not branch target specific. It's aligning the start
of the next instruction.
Also could you add a comment to:

	align16_branch_target(&prog);
	for (i = 0; i < fmod_ret->nr_progs; i++)
		emit_cond_near_jump(&branches[i], prog, branches[i],
				    X86_JNE);
	kfree(branches);

to say that the loop updates the prior locations to jump to the
aligned branch target?

2020-03-04 00:03:53

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH bpf-next 4/7] bpf: Attachment verification for BPF_MODIFY_RETURN

On Wed, Mar 04, 2020 at 12:21:51AM +0100, KP Singh wrote:
>
> > > + t = btf_type_skip_modifiers(btf, t->type, NULL);
> > > + if (!btf_type_is_int(t)) {
> >
> > Should the size of int be verified here? E.g., if some function
> > returns u8, is that ok for BPF program to return, say, (1<<30) ?
>
> Would this work?
>
> if (size != t->size) {
> bpf_log(log,
> "size accessed = %d should be %d\n",
> size, t->size);
> return false;
> }

It will cause spurious failures later when llvm optimizes
if (ret & 0xff) into a u8 load.
I think btf_type_is_int() is enough as-is.

2020-03-04 01:06:44

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next 4/7] bpf: Attachment verification for BPF_MODIFY_RETURN

On 03-Mär 16:03, Alexei Starovoitov wrote:
> On Wed, Mar 04, 2020 at 12:21:51AM +0100, KP Singh wrote:
> >
> > > > + t = btf_type_skip_modifiers(btf, t->type, NULL);
> > > > + if (!btf_type_is_int(t)) {
> > >
> > > Should the size of int be verified here? E.g., if some function
> > > returns u8, is that ok for BPF program to return, say, (1<<30) ?
> >
> > Would this work?
> >
> > if (size != t->size) {
> > 	bpf_log(log,
> > 		"size accessed = %d should be %d\n",
> > 		size, t->size);
> > 	return false;
> > }
>
> It will cause spurious failures later when llvm optimizes
> if (ret & 0xff) into a u8 load.
> I think btf_type_is_int() is enough as-is.

Okay skipping the size check.

- KP

2020-03-04 01:26:33

by KP Singh

[permalink] [raw]
Subject: Re: [PATCH bpf-next 2/7] bpf: JIT helpers for fmod_ret progs

On 03-Mär 15:56, Alexei Starovoitov wrote:
> On Tue, Mar 03, 2020 at 11:28:12PM +0100, KP Singh wrote:
> > > > +static void align16_branch_target(u8 **pprog)
> > > > +{
> > > > + u8 *target, *prog = *pprog;
> > > > +
> > > > + target = PTR_ALIGN(prog, 16);
> > > > + if (target != prog)
> > > > + emit_nops(&prog, target - prog);
> > > > + if (target != prog)
> > > > + pr_err("calcultion error\n");
> > >
> > > this wasn't in the original code, do you feel like it's more important
> > > to check this and print error?
> > >
> > > also typo: calculation error, but then it's a bit brief and
> > > uninformative message. So I don't know, maybe just drop it?
> >
> > Ah, good catch! This is definitely not intended to be here.
> > It's a debug artifact and needs to be dropped indeed.
>
> That spurious pr_err() caught my attention as well.
> After further analysis there is a bug here.
> The function is missing last line:
> *pprog = prog;

Great catch! Fixed.

> Without it the nop insertion is actually not happening.
> Nops are being written, but next insns will overwrite them.
> When I noticed it by code review I applied the patches to my tree
> and ran the tests and, as expected, all tests passed.
> The existing test_xdp_veth.sh emits the largest number of unaligned
> branches. Since then I've been thinking whether we could add a test
> to catch things like this and couldn't come up with a way to test it
> without burning a lot of code on it. So let's fix it and move on.
> Could you rename this helper? Maybe emit_align() and pass 16 into it?

Seems reasonable. Done.

> The code is not branch target specific. It's aligning the start
> of the next instruction.
> Also could you add a comment to:

Done. Sending v2 out.

- KP

> 	align16_branch_target(&prog);
> 	for (i = 0; i < fmod_ret->nr_progs; i++)
> 		emit_cond_near_jump(&branches[i], prog, branches[i],
> 				    X86_JNE);
> 	kfree(branches);
> to say that the loop updates the prior locations to jump to the
> aligned branch target?