2023-04-03 11:26:10

by Florent Revest

[permalink] [raw]
Subject: [PATCH v4 0/4] Add ftrace direct call for arm64

This series adds ftrace direct call support to arm64.
This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.

It is meant to be taken by the arm64 tree but it depends on the
trace-direct-v6.3-rc3 tag of the linux-trace tree:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
this depends on. [1]

Thanks to the ftrace refactoring under that tag, an ftrace_ops backing a ftrace
direct call will only ever point to *one* direct call. This means we can look up
the direct called trampoline address stored in the ops from the ftrace_caller
trampoline in the case when the destination would be out of reach of a BL
instruction at the ftrace callsite. This fixes limitations of previous attempts
such as [2].

This series has been tested on arm64 with:
1- CONFIG_FTRACE_SELFTEST
2- samples/ftrace/*.ko (cf: patch 3)
3- tools/testing/selftests/bpf/test_progs (cf: patch 4)

Changes since v3 [3]:
- Added "BTI C" instructions at the beginning of each ftrace direct call sample
- Added Mark Rutland's Acked-by to patch 2

1: https://lore.kernel.org/all/ZB2Nl7fzpHoq5V20@FVFF77S0Q05N/
2: https://lore.kernel.org/all/[email protected]/
3: https://lore.kernel.org/bpf/[email protected]/

Florent Revest (4):
arm64: ftrace: Add direct call support
arm64: ftrace: Simplify get_ftrace_plt
arm64: ftrace: Add direct call trampoline samples support
selftests/bpf: Update the tests deny list on aarch64

arch/arm64/Kconfig | 6 ++
arch/arm64/include/asm/ftrace.h | 22 +++++
arch/arm64/kernel/asm-offsets.c | 6 ++
arch/arm64/kernel/entry-ftrace.S | 90 ++++++++++++++++----
arch/arm64/kernel/ftrace.c | 46 +++++++---
samples/ftrace/ftrace-direct-modify.c | 32 +++++++
samples/ftrace/ftrace-direct-multi-modify.c | 36 ++++++++
samples/ftrace/ftrace-direct-multi.c | 22 +++++
samples/ftrace/ftrace-direct-too.c | 25 ++++++
samples/ftrace/ftrace-direct.c | 23 +++++
tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++----------------
11 files changed, 288 insertions(+), 102 deletions(-)

--
2.40.0.423.gd6c402a77b-goog


2023-04-03 11:26:24

by Florent Revest

[permalink] [raw]
Subject: [PATCH v4 1/4] arm64: ftrace: Add direct call support

This builds up on the CALL_OPS work which extends the ftrace patchsite
on arm64 with an ops pointer usable by the ftrace trampoline.

This ops pointer is valid at all time. Indeed, it is either pointing to
ftrace_list_ops or to the single ops which should be called from that
patchsite.

There are a few cases to distinguish:
- If a direct call ops is the only one tracing a function:
- If the direct called trampoline is within the reach of a BL
instruction
-> the ftrace patchsite jumps to the trampoline
- Else
-> the ftrace patchsite jumps to the ftrace_caller trampoline which
reads the ops pointer in the patchsite and jumps to the direct
call address stored in the ops
- Else
-> the ftrace patchsite jumps to the ftrace_caller trampoline and its
ops literal points to ftrace_list_ops so it iterates over all
registered ftrace ops, including the direct call ops and calls its
call_direct_funcs handler which stores the direct called
trampoline's address in the ftrace_regs and the ftrace_caller
trampoline will return to that address instead of returning to the
traced function

Signed-off-by: Florent Revest <[email protected]>
Co-developed-by: Mark Rutland <[email protected]>
Signed-off-by: Mark Rutland <[email protected]>
---
arch/arm64/Kconfig | 4 ++
arch/arm64/include/asm/ftrace.h | 22 ++++++++
arch/arm64/kernel/asm-offsets.c | 6 +++
arch/arm64/kernel/entry-ftrace.S | 90 ++++++++++++++++++++++++++------
arch/arm64/kernel/ftrace.c | 36 +++++++++++--
5 files changed, 138 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1023e896d46b..f3503d0cc1b8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -185,6 +185,10 @@ config ARM64
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
+ select HAVE_DYNAMIC_FTRACE_WITH_ARGS \
+ if $(cc-option,-fpatchable-function-entry=2)
+ select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
+ if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG && \
!CC_OPTIMIZE_FOR_SIZE)
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 1c2672bbbf37..b87d70b693c6 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -70,10 +70,19 @@ struct ftrace_ops;

#define arch_ftrace_get_regs(regs) NULL

+/*
+ * Note: sizeof(struct ftrace_regs) must be a multiple of 16 to ensure correct
+ * stack alignment
+ */
struct ftrace_regs {
/* x0 - x8 */
unsigned long regs[9];
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ unsigned long direct_tramp;
+#else
unsigned long __unused;
+#endif

unsigned long fp;
unsigned long lr;
@@ -136,6 +145,19 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs);
#define ftrace_graph_func ftrace_graph_func
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
+ unsigned long addr)
+{
+ /*
+ * The ftrace trampoline will return to this address instead of the
+ * instrumented function.
+ */
+ fregs->direct_tramp = addr;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
#endif

#define ftrace_return_address(n) return_address(n)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index ae345b06e9f7..0996094b0d22 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -93,6 +93,9 @@ int main(void)
DEFINE(FREGS_LR, offsetof(struct ftrace_regs, lr));
DEFINE(FREGS_SP, offsetof(struct ftrace_regs, sp));
DEFINE(FREGS_PC, offsetof(struct ftrace_regs, pc));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ DEFINE(FREGS_DIRECT_TRAMP, offsetof(struct ftrace_regs, direct_tramp));
+#endif
DEFINE(FREGS_SIZE, sizeof(struct ftrace_regs));
BLANK();
#endif
@@ -197,6 +200,9 @@ int main(void)
#endif
#ifdef CONFIG_FUNCTION_TRACER
DEFINE(FTRACE_OPS_FUNC, offsetof(struct ftrace_ops, func));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ DEFINE(FTRACE_OPS_DIRECT_CALL, offsetof(struct ftrace_ops, direct_call));
+#endif
#endif
return 0;
}
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 350ed81324ac..1c38a60575aa 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -36,6 +36,31 @@
SYM_CODE_START(ftrace_caller)
bti c

+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+ /*
+ * The literal pointer to the ops is at an 8-byte aligned boundary
+ * which is either 12 or 16 bytes before the BL instruction in the call
+ * site. See ftrace_call_adjust() for details.
+ *
+ * Therefore here the LR points at `literal + 16` or `literal + 20`,
+ * and we can find the address of the literal in either case by
+ * aligning to an 8-byte boundary and subtracting 16. We do the
+ * alignment first as this allows us to fold the subtraction into the
+ * LDR.
+ */
+ bic x11, x30, 0x7
+ ldr x11, [x11, #-(4 * AARCH64_INSN_SIZE)] // op
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ /*
+ * If the op has a direct call, handle it immediately without
+ * saving/restoring registers.
+ */
+ ldr x17, [x11, #FTRACE_OPS_DIRECT_CALL] // op->direct_call
+ cbnz x17, ftrace_caller_direct
+#endif
+#endif
+
/* Save original SP */
mov x10, sp

@@ -49,6 +74,10 @@ SYM_CODE_START(ftrace_caller)
stp x6, x7, [sp, #FREGS_X6]
str x8, [sp, #FREGS_X8]

+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ str xzr, [sp, #FREGS_DIRECT_TRAMP]
+#endif
+
/* Save the callsite's FP, LR, SP */
str x29, [sp, #FREGS_FP]
str x9, [sp, #FREGS_LR]
@@ -71,20 +100,7 @@ SYM_CODE_START(ftrace_caller)
mov x3, sp // regs

#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
- /*
- * The literal pointer to the ops is at an 8-byte aligned boundary
- * which is either 12 or 16 bytes before the BL instruction in the call
- * site. See ftrace_call_adjust() for details.
- *
- * Therefore here the LR points at `literal + 16` or `literal + 20`,
- * and we can find the address of the literal in either case by
- * aligning to an 8-byte boundary and subtracting 16. We do the
- * alignment first as this allows us to fold the subtraction into the
- * LDR.
- */
- bic x2, x30, 0x7
- ldr x2, [x2, #-16] // op
-
+ mov x2, x11 // op
ldr x4, [x2, #FTRACE_OPS_FUNC] // op->func
blr x4 // op->func(ip, parent_ip, op, regs)

@@ -107,8 +123,15 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
ldp x6, x7, [sp, #FREGS_X6]
ldr x8, [sp, #FREGS_X8]

- /* Restore the callsite's FP, LR, PC */
+ /* Restore the callsite's FP */
ldr x29, [sp, #FREGS_FP]
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+ ldr x17, [sp, #FREGS_DIRECT_TRAMP]
+ cbnz x17, ftrace_caller_direct_late
+#endif
+
+ /* Restore the callsite's LR and PC */
ldr x30, [sp, #FREGS_LR]
ldr x9, [sp, #FREGS_PC]

@@ -116,8 +139,45 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
add sp, sp, #FREGS_SIZE + 32

ret x9
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_INNER_LABEL(ftrace_caller_direct_late, SYM_L_LOCAL)
+ /*
+ * Head to a direct trampoline in x17 after having run other tracers.
+ * The ftrace_regs are live, and x0-x8 and FP have been restored. The
+ * LR, PC, and SP have not been restored.
+ */
+
+ /*
+ * Restore the callsite's LR and PC matching the trampoline calling
+ * convention.
+ */
+ ldr x9, [sp, #FREGS_LR]
+ ldr x30, [sp, #FREGS_PC]
+
+ /* Restore the callsite's SP */
+ add sp, sp, #FREGS_SIZE + 32
+
+SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL)
+ /*
+ * Head to a direct trampoline in x17.
+ *
+ * We use `BR X17` as this can safely land on a `BTI C` or `PACIASP` in
+ * the trampoline, and will not unbalance any return stack.
+ */
+ br x17
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
SYM_CODE_END(ftrace_caller)

+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_CODE_START(ftrace_stub_direct_tramp)
+ bti c
+ mov x10, x30
+ mov x30, x9
+ ret x10
+SYM_CODE_END(ftrace_stub_direct_tramp)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
#else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */

/*
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 5545fe1a9012..758436727fba 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -206,6 +206,13 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
return NULL;
}

+static bool reachable_by_bl(unsigned long addr, unsigned long pc)
+{
+ long offset = (long)addr - (long)pc;
+
+ return offset >= -SZ_128M && offset < SZ_128M;
+}
+
/*
* Find the address the callsite must branch to in order to reach '*addr'.
*
@@ -220,14 +227,21 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
unsigned long *addr)
{
unsigned long pc = rec->ip;
- long offset = (long)*addr - (long)pc;
struct plt_entry *plt;

+ /*
+ * If a custom trampoline is unreachable, rely on the ftrace_caller
+ * trampoline which knows how to indirectly reach that trampoline
+ * through ops->direct_call.
+ */
+ if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
+ *addr = FTRACE_ADDR;
+
/*
* When the target is within range of the 'BL' instruction, use 'addr'
* as-is and branch to that directly.
*/
- if (offset >= -SZ_128M && offset < SZ_128M)
+ if (reachable_by_bl(*addr, pc))
return true;

/*
@@ -330,12 +344,24 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
unsigned long addr)
{
- if (WARN_ON_ONCE(old_addr != (unsigned long)ftrace_caller))
+ unsigned long pc = rec->ip;
+ u32 old, new;
+ int ret;
+
+ ret = ftrace_rec_set_ops(rec, arm64_rec_get_ops(rec));
+ if (ret)
+ return ret;
+
+ if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
return -EINVAL;
- if (WARN_ON_ONCE(addr != (unsigned long)ftrace_caller))
+ if (!ftrace_find_callable_addr(rec, NULL, &addr))
return -EINVAL;

- return ftrace_rec_update_ops(rec);
+ old = aarch64_insn_gen_branch_imm(pc, old_addr,
+ AARCH64_INSN_BRANCH_LINK);
+ new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+
+ return ftrace_modify_code(pc, old, new, true);
}
#endif

--
2.40.0.423.gd6c402a77b-goog

2023-04-03 11:26:41

by Florent Revest

[permalink] [raw]
Subject: [PATCH v4 2/4] arm64: ftrace: Simplify get_ftrace_plt

Following recent refactorings, the get_ftrace_plt function only ever
gets called with addr = FTRACE_ADDR so its code can be simplified to
always return the ftrace trampoline plt.

Signed-off-by: Florent Revest <[email protected]>
Acked-by: Mark Rutland <[email protected]>
---
arch/arm64/kernel/ftrace.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 758436727fba..432626c866a8 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -195,15 +195,15 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
return ftrace_modify_code(pc, 0, new, false);
}

-static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
+static struct plt_entry *get_ftrace_plt(struct module *mod)
{
#ifdef CONFIG_ARM64_MODULE_PLTS
struct plt_entry *plt = mod->arch.ftrace_trampolines;

- if (addr == FTRACE_ADDR)
- return &plt[FTRACE_PLT_IDX];
-#endif
+ return &plt[FTRACE_PLT_IDX];
+#else
return NULL;
+#endif
}

static bool reachable_by_bl(unsigned long addr, unsigned long pc)
@@ -270,7 +270,7 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
if (WARN_ON(!mod))
return false;

- plt = get_ftrace_plt(mod, *addr);
+ plt = get_ftrace_plt(mod);
if (!plt) {
pr_err("ftrace: no module PLT for %ps\n", (void *)*addr);
return false;
--
2.40.0.423.gd6c402a77b-goog

2023-04-03 11:26:52

by Florent Revest

[permalink] [raw]
Subject: [PATCH v4 3/4] arm64: ftrace: Add direct call trampoline samples support

The ftrace samples need per-architecture trampoline implementations
to save and restore argument registers around the calls to
my_direct_func* and to restore polluted registers (eg: x30).

These samples also include <asm/asm-offsets.h> which, on arm64, is not
necessary and redefines previously defined macros (resulting in
warnings) so these includes are guarded by !CONFIG_ARM64.

Signed-off-by: Florent Revest <[email protected]>
---
arch/arm64/Kconfig | 2 ++
samples/ftrace/ftrace-direct-modify.c | 32 ++++++++++++++++++
samples/ftrace/ftrace-direct-multi-modify.c | 36 +++++++++++++++++++++
samples/ftrace/ftrace-direct-multi.c | 22 +++++++++++++
samples/ftrace/ftrace-direct-too.c | 25 ++++++++++++++
samples/ftrace/ftrace-direct.c | 23 +++++++++++++
6 files changed, 140 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f3503d0cc1b8..c2bf28099abd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -194,6 +194,8 @@ config ARM64
!CC_OPTIMIZE_FOR_SIZE)
select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
if DYNAMIC_FTRACE_WITH_ARGS
+ select HAVE_SAMPLE_FTRACE_DIRECT
+ select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index 25fba66f61c0..32ed0e1f8699 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -2,7 +2,9 @@
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
#include <asm/asm-offsets.h>
+#endif

extern void my_direct_func1(void);
extern void my_direct_func2(void);
@@ -96,6 +98,36 @@ asm (

#endif /* CONFIG_S390 */

+#ifdef CONFIG_ARM64
+
+asm (
+" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp1, @function\n"
+" .globl my_tramp1\n"
+" my_tramp1:"
+" sub sp, sp, #16\n"
+" stp x9, x30, [sp]\n"
+" bl my_direct_func1\n"
+" ldp x30, x9, [sp]\n"
+" add sp, sp, #16\n"
+" ret x9\n"
+" .size my_tramp1, .-my_tramp1\n"
+
+" .type my_tramp2, @function\n"
+" .globl my_tramp2\n"
+" my_tramp2:"
+" sub sp, sp, #16\n"
+" stp x9, x30, [sp]\n"
+" bl my_direct_func2\n"
+" ldp x30, x9, [sp]\n"
+" add sp, sp, #16\n"
+" ret x9\n"
+" .size my_tramp2, .-my_tramp2\n"
+" .popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
static struct ftrace_ops direct;

static unsigned long my_tramp = (unsigned long)my_tramp1;
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index f72623899602..0ba40891d43e 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -2,7 +2,9 @@
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
#include <asm/asm-offsets.h>
+#endif

extern void my_direct_func1(unsigned long ip);
extern void my_direct_func2(unsigned long ip);
@@ -103,6 +105,40 @@ asm (

#endif /* CONFIG_S390 */

+#ifdef CONFIG_ARM64
+
+asm (
+" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp1, @function\n"
+" .globl my_tramp1\n"
+" my_tramp1:"
+" sub sp, sp, #32\n"
+" stp x9, x30, [sp]\n"
+" str x0, [sp, #16]\n"
+" bl my_direct_func1\n"
+" ldp x30, x9, [sp]\n"
+" ldr x0, [sp, #16]\n"
+" add sp, sp, #32\n"
+" ret x9\n"
+" .size my_tramp1, .-my_tramp1\n"
+
+" .type my_tramp2, @function\n"
+" .globl my_tramp2\n"
+" my_tramp2:"
+" sub sp, sp, #32\n"
+" stp x9, x30, [sp]\n"
+" str x0, [sp, #16]\n"
+" bl my_direct_func2\n"
+" ldp x30, x9, [sp]\n"
+" ldr x0, [sp, #16]\n"
+" add sp, sp, #32\n"
+" ret x9\n"
+" .size my_tramp2, .-my_tramp2\n"
+" .popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
static unsigned long my_tramp = (unsigned long)my_tramp1;
static unsigned long tramps[2] = {
(unsigned long)my_tramp1,
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index 1547c2c6be02..0b072e763c97 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -4,7 +4,9 @@
#include <linux/mm.h> /* for handle_mm_fault() */
#include <linux/ftrace.h>
#include <linux/sched/stat.h>
+#ifndef CONFIG_ARM64
#include <asm/asm-offsets.h>
+#endif

extern void my_direct_func(unsigned long ip);

@@ -66,6 +68,26 @@ asm (

#endif /* CONFIG_S390 */

+#ifdef CONFIG_ARM64
+
+asm (
+" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp, @function\n"
+" .globl my_tramp\n"
+" my_tramp:"
+" sub sp, sp, #32\n"
+" stp x9, x30, [sp]\n"
+" str x0, [sp, #16]\n"
+" bl my_direct_func\n"
+" ldp x30, x9, [sp]\n"
+" ldr x0, [sp, #16]\n"
+" add sp, sp, #32\n"
+" ret x9\n"
+" .size my_tramp, .-my_tramp\n"
+" .popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
static struct ftrace_ops direct;

static int __init ftrace_direct_multi_init(void)
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index f28e7b99840f..5606b7ad1950 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -3,7 +3,9 @@

#include <linux/mm.h> /* for handle_mm_fault() */
#include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
#include <asm/asm-offsets.h>
+#endif

extern void my_direct_func(struct vm_area_struct *vma,
unsigned long address, unsigned int flags);
@@ -70,6 +72,29 @@ asm (

#endif /* CONFIG_S390 */

+#ifdef CONFIG_ARM64
+
+asm (
+" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp, @function\n"
+" .globl my_tramp\n"
+" my_tramp:"
+" sub sp, sp, #48\n"
+" stp x9, x30, [sp]\n"
+" stp x0, x1, [sp, #16]\n"
+" str x2, [sp, #32]\n"
+" bl my_direct_func\n"
+" ldp x30, x9, [sp]\n"
+" ldp x0, x1, [sp, #16]\n"
+" ldr x2, [sp, #32]\n"
+" add sp, sp, #48\n"
+" ret x9\n"
+" .size my_tramp, .-my_tramp\n"
+" .popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
static struct ftrace_ops direct;

static int __init ftrace_direct_init(void)
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index d81a9473b585..7e20529ef132 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -3,7 +3,9 @@

#include <linux/sched.h> /* for wake_up_process() */
#include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
#include <asm/asm-offsets.h>
+#endif

extern void my_direct_func(struct task_struct *p);

@@ -63,6 +65,27 @@ asm (

#endif /* CONFIG_S390 */

+#ifdef CONFIG_ARM64
+
+asm (
+" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp, @function\n"
+" .globl my_tramp\n"
+" my_tramp:"
+" sub sp, sp, #32\n"
+" stp x9, x30, [sp]\n"
+" str x0, [sp, #16]\n"
+" bl my_direct_func\n"
+" ldp x30, x9, [sp]\n"
+" ldr x0, [sp, #16]\n"
+" add sp, sp, #32\n"
+" ret x9\n"
+" .size my_tramp, .-my_tramp\n"
+" .popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
static struct ftrace_ops direct;

static int __init ftrace_direct_init(void)
--
2.40.0.423.gd6c402a77b-goog

2023-04-03 11:27:11

by Florent Revest

[permalink] [raw]
Subject: [PATCH v4 4/4] selftests/bpf: Update the tests deny list on aarch64

Now that ftrace supports direct call on arm64, BPF tracing programs work
on that architecture. This fixes the vast majority of BPF selftests
except for:

- multi_kprobe programs which require fprobe, not available on arm64 yet
- tracing_struct which requires trampoline support to access struct args

This patch updates the list of BPF selftests which are known to fail so
the BPF CI can validate the tests which pass now.

Signed-off-by: Florent Revest <[email protected]>
---
tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++------------------
1 file changed, 5 insertions(+), 77 deletions(-)

diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 99cc33c51eaa..6b95cb544094 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -1,33 +1,5 @@
-bloom_filter_map # libbpf: prog 'check_bloom': failed to attach: ERROR: strerror_r(-524)=22
-bpf_cookie/lsm
-bpf_cookie/multi_kprobe_attach_api
-bpf_cookie/multi_kprobe_link_api
-bpf_cookie/trampoline
-bpf_loop/check_callback_fn_stop # link unexpected error: -524
-bpf_loop/check_invalid_flags
-bpf_loop/check_nested_calls
-bpf_loop/check_non_constant_callback
-bpf_loop/check_nr_loops
-bpf_loop/check_null_callback_ctx
-bpf_loop/check_stack
-bpf_mod_race # bpf_mod_kfunc_race__attach unexpected error: -524 (errno 524)
-bpf_tcp_ca/dctcp_fallback
-btf_dump/btf_dump: var_data # find type id unexpected find type id: actual -2 < expected 0
-cgroup_hierarchical_stats # attach unexpected error: -524 (errno 524)
-d_path/basic # setup attach failed: -524
-deny_namespace # attach unexpected error: -524 (errno 524)
-fentry_fexit # fentry_attach unexpected error: -1 (errno 524)
-fentry_test # fentry_attach unexpected error: -1 (errno 524)
-fexit_sleep # fexit_attach fexit attach failed: -1
-fexit_stress # fexit attach unexpected fexit attach: actual -524 < expected 0
-fexit_test # fexit_attach unexpected error: -1 (errno 524)
-get_func_args_test # get_func_args_test__attach unexpected error: -524 (errno 524) (trampoline)
-get_func_ip_test # get_func_ip_test__attach unexpected error: -524 (errno 524) (trampoline)
-htab_update/reenter_update
-kfree_skb # attach fentry unexpected error: -524 (trampoline)
-kfunc_call/subprog # extern (var ksym) 'bpf_prog_active': not found in kernel BTF
-kfunc_call/subprog_lskel # skel unexpected error: -2
-kfunc_dynptr_param/dynptr_data_null # libbpf: prog 'dynptr_data_null': failed to attach: ERROR: strerror_r(-524)=22
+bpf_cookie/multi_kprobe_attach_api # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
+bpf_cookie/multi_kprobe_link_api # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
kprobe_multi_bench_attach # bpf_program__attach_kprobe_multi_opts unexpected error: -95
kprobe_multi_test/attach_api_addrs # bpf_program__attach_kprobe_multi_opts unexpected error: -95
kprobe_multi_test/attach_api_pattern # bpf_program__attach_kprobe_multi_opts unexpected error: -95
@@ -35,50 +7,6 @@ kprobe_multi_test/attach_api_syms # bpf_program__attach_kprobe_mu
kprobe_multi_test/bench_attach # bpf_program__attach_kprobe_multi_opts unexpected error: -95
kprobe_multi_test/link_api_addrs # link_fd unexpected link_fd: actual -95 < expected 0
kprobe_multi_test/link_api_syms # link_fd unexpected link_fd: actual -95 < expected 0
-kprobe_multi_test/skel_api # kprobe_multi__attach unexpected error: -524 (errno 524)
-ksyms_module/libbpf # 'bpf_testmod_ksym_percpu': not found in kernel BTF
-ksyms_module/lskel # test_ksyms_module_lskel__open_and_load unexpected error: -2
-libbpf_get_fd_by_id_opts # test_libbpf_get_fd_by_id_opts__attach unexpected error: -524 (errno 524)
-linked_list
-lookup_key # test_lookup_key__attach unexpected error: -524 (errno 524)
-lru_bug # lru_bug__attach unexpected error: -524 (errno 524)
-modify_return # modify_return__attach failed unexpected error: -524 (errno 524)
-module_attach # skel_attach skeleton attach failed: -524
-mptcp/base # run_test mptcp unexpected error: -524 (errno 524)
-netcnt # packets unexpected packets: actual 10001 != expected 10000
-rcu_read_lock # failed to attach: ERROR: strerror_r(-524)=22
-recursion # skel_attach unexpected error: -524 (errno 524)
-ringbuf # skel_attach skeleton attachment failed: -1
-setget_sockopt # attach_cgroup unexpected error: -524
-sk_storage_tracing # test_sk_storage_tracing__attach unexpected error: -524 (errno 524)
-skc_to_unix_sock # could not attach BPF object unexpected error: -524 (errno 524)
-socket_cookie # prog_attach unexpected error: -524
-stacktrace_build_id # compare_stack_ips stackmap vs. stack_amap err -1 errno 2
-task_local_storage/exit_creds # skel_attach unexpected error: -524 (errno 524)
-task_local_storage/recursion # skel_attach unexpected error: -524 (errno 524)
-test_bprm_opts # attach attach failed: -524
-test_ima # attach attach failed: -524
-test_local_storage # attach lsm attach failed: -524
-test_lsm # test_lsm_first_attach unexpected error: -524 (errno 524)
-test_overhead # attach_fentry unexpected error: -524
-timer # timer unexpected error: -524 (errno 524)
-timer_crash # timer_crash__attach unexpected error: -524 (errno 524)
-timer_mim # timer_mim unexpected error: -524 (errno 524)
-trace_printk # trace_printk__attach unexpected error: -1 (errno 524)
-trace_vprintk # trace_vprintk__attach unexpected error: -1 (errno 524)
-tracing_struct # tracing_struct__attach unexpected error: -524 (errno 524)
-trampoline_count # attach_prog unexpected error: -524
-unpriv_bpf_disabled # skel_attach unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_post_misaligned # misaligned_skel unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_post_producer_wrong_offset
-user_ringbuf/test_user_ringbuf_post_larger_than_ringbuf_sz
-user_ringbuf/test_user_ringbuf_basic # ringbuf_basic_skel unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_sample_full_ring_buffer
-user_ringbuf/test_user_ringbuf_post_alignment_autoadjust
-user_ringbuf/test_user_ringbuf_overfill
-user_ringbuf/test_user_ringbuf_discards_properly_ignored
-user_ringbuf/test_user_ringbuf_loop
-user_ringbuf/test_user_ringbuf_msg_protocol
-user_ringbuf/test_user_ringbuf_blocking_reserve
-verify_pkcs7_sig # test_verify_pkcs7_sig__attach unexpected error: -524 (errno 524)
-vmlinux # skel_attach skeleton attach failed: -524
+kprobe_multi_test/skel_api # libbpf: failed to load BPF skeleton 'kprobe_multi': -3
+module_attach # prog 'kprobe_multi': failed to auto-attach: -95
+tracing_struct # tracing_struct__attach unexpected error: -524 (errno 524)
\ No newline at end of file
--
2.40.0.423.gd6c402a77b-goog

2023-04-03 11:28:36

by Florent Revest

[permalink] [raw]
Subject: Re: [PATCH v4 0/4] Add ftrace direct call for arm64

On Mon, Apr 3, 2023 at 1:21 PM Florent Revest <[email protected]> wrote:
>
> This series adds ftrace direct call support to arm64.
> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
>
> It is meant to be taken by the arm64 tree but it depends on the
> trace-direct-v6.3-rc3 tag of the linux-trace tree:
> git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
> That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
> this depends on. [1]
>
> Thanks to the ftrace refactoring under that tag, an ftrace_ops backing a ftrace
> direct call will only ever point to *one* direct call. This means we can look up
> the direct called trampoline address stored in the ops from the ftrace_caller
> trampoline in the case when the destination would be out of reach of a BL
> instruction at the ftrace callsite. This fixes limitations of previous attempts
> such as [2].
>
> This series has been tested on arm64 with:
> 1- CONFIG_FTRACE_SELFTEST
> 2- samples/ftrace/*.ko (cf: patch 3)
> 3- tools/testing/selftests/bpf/test_progs (cf: patch 4)
>
> Changes since v3 [3]:
> - Added "BTI C" instructions at the beginning of each ftrace direct call sample

Ugh, I am an idiot (let's just blame Mondays!) and didn't actually
amend this change, I'm sending the series again as a v5 and this time
with the change actually folded in... Please ignore this v4, sorry for
the noise! :|