Hi again!
V4 should include all your requested changes. Since only Julien
commented "OK" on the reliable stacktrace part, I finished it on my
own. This set now passes the relevant tests in Libor's test suite, so
livepatching the kernel proper does work.
Remember to apply Jessica's addendum in order to livepatch functions
that live in modules.
[Changes from v3]:
* Compiler support for -fpatchable-function-entry now automagically
selects _WITH_REGS when DYNAMIC_FTRACE is switched on. Consequently,
CONFIG_DYNAMIC_FTRACE_WITH_REGS is the only preprocessor symbol
set by this feature (as asked for by Takahiro in v2)
* The dynamic ftrace caller creates 2 stack frames, as suggested by Ard:
first a "preliminary" one for the callee, and another for ftrace_caller
itself. This gives the stack layout a really clean look.
* Because the ftrace-clobbered x9 is now saved immediately in the
"callee" frame, it can be used as the base register for pt_regs accesses.
Much prettier now.
* Dynamic replacement insn "mov x9, lr" is generated using the common
framework; a hopefully meaningful macro name is used for abbreviation.
* The use_ftrace_trampoline() helper introduced in v3 got renamed
and streamlined with a reference variable, both as pointed out by Mark.
* Superfluous barriers during trace application removed.
* #ifdef replaced by IS_ENABLED() where possible.
* Made stuff compile with gcc7 or older, too ;-)
* Fix my misguided assumption about a .text.ftrace_regs_trampoline
section; the second trampoline goes into .text.ftrace_trampoline as well.
* Properly detect the bottom of kthread stacks, by setting a global
symbol to the address their LR points to and comparing against it.
* Rewrote many comments to hopefully clear things up.
[Changes from v2]:
* ifeq($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y) instead of ifdef
* "fix" commit 06aeaaeabf69da4. (new patch 1)
Made DYNAMIC_FTRACE_WITH_REGS a real choice. The expectation is that a
linux-4.20 kernel on arm64 will be built with gcc >= 8; in that case, as
on all other archs, the "default y" works. Only arm64 kernels >= 4.20
built with gcc < 8 must change this to "n" in order not to be stopped by
the Makefile $(error) from patch 2/4. You'll then fall back to plain
DYNAMIC_FTRACE, if selected, like before.
* use some S_X* constants to refer to offsets into pt_regs in assembly.
* have the compiler/assembler generate the mov x9,x30 instruction that
saves LR at compile time, rather than generate it repeatedly at runtime.
* flip the ftrace_regs_caller stack frame so that it is no longer
upside down, as Ard remarked. This change broke the graph caller somehow.
* extend handling of the module arch-dependent ftrace trampoline with
a companion "regs" version.
* clear the _TIF_PATCH_PENDING on do_notify_resume()
* took care of arch/arm64/kernel/time.c when changing stack unwinder
semantics
[Changes from v1]:
* Missing compiler support is now a Makefile error, instead
of a warning. This will keep the compile log shorter and
it will thus be easier to spot the problem.
* A separate ftrace_regs_caller. Only that one will write out
a complete pt_regs, for efficiency.
* Replace the use of X19 with X28 to remember the old PC during
live patch detection, as only that is saved&restored now for
non-regs ftrace.
* CONFIG_DYNAMIC_FTRACE_WITH_REGS and CC_USING_PATCHABLE_FUNCTION_ENTRY
are currently synonymous on arm64, but differentiate better for
the future when this is no longer the case.
* Clean up "old"/"new" insn value setting vs. #ifdefs.
* #define a INSN_MOV_X9_X30 with suggested aarch64_insn_gen call
and use that instead of an immediate hex value.
Torsten
Use -fpatchable-function-entry (gcc8) to add 2 NOPs at the beginning
of each function. Replace the first NOP thus generated with a quick LR
saver (move it to scratch reg x9), so the 2nd replacement insn, the call
to ftrace, does not clobber the value. Ftrace will then generate the
standard stack frames.
Note that patchable-function-entry in GCC disables IPA-RA, which means
ABI register calling conventions are obeyed *and* scratch registers
such as x9 are available.
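For illustration (a sketch only, not part of the diffs below), a function
entry built with -fpatchable-function-entry=2 then starts as

	func:
		nop		// patched to:	mov	x9, x30
		nop		// patched to:	bl	ftrace_caller
				//	    (or	bl	ftrace_regs_caller)
		<function body>

so the first replacement insn preserves the LR in x9 and the second one
branches into ftrace.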
Introduce and handle an ftrace_regs_trampoline for module PLTs, right
after ftrace_trampoline, and double the size of this special section
if .text.ftrace_trampoline is present in the module.
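The doubled section then holds two PLT slots (sketch):

	.text.ftrace_trampoline:
		plt_entry[0] -> ftrace_caller       (arch.ftrace_trampoline)
		plt_entry[1] -> ftrace_regs_caller  (arch.ftrace_regs_trampoline)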
Signed-off-by: Torsten Duwe <[email protected]>
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -110,6 +110,8 @@ config ARM64
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
+ select HAVE_DYNAMIC_FTRACE_WITH_REGS \
+ if $(cc-option,-fpatchable-function-entry=2)
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -78,6 +78,10 @@ ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/arm64/kernel/module.lds
endif
+ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
+ CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+endif
+
# Default value
head-y := arch/arm64/kernel/head.o
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -16,6 +16,19 @@
#define MCOUNT_ADDR ((unsigned long)_mcount)
#define MCOUNT_INSN_SIZE AARCH64_INSN_SIZE
+/*
+ * DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
+ * of each function, with the second NOP actually calling ftrace. In contrast
+ * to a classic _mcount call, the call instruction to be modified is thus
+ * the second one, and not the only one.
+ */
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+#define ARCH_SUPPORTS_FTRACE_OPS 1
+#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
+#else
+#define REC_IP_BRANCH_OFFSET 0
+#endif
+
#ifndef __ASSEMBLY__
#include <linux/compat.h>
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -7,9 +7,9 @@ CPPFLAGS_vmlinux.lds := -DTEXT_OFFSET=$(
AFLAGS_head.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
CFLAGS_armv8_deprecated.o := -I$(src)
-CFLAGS_REMOVE_ftrace.o = -pg
-CFLAGS_REMOVE_insn.o = -pg
-CFLAGS_REMOVE_return_address.o = -pg
+CFLAGS_REMOVE_ftrace.o = -pg $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_insn.o = -pg $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_return_address.o = -pg $(CC_FLAGS_FTRACE)
# Object file lists.
arm64-obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -11,7 +11,8 @@ cflags-$(CONFIG_X86) += -m$(BITS) -D__K
-fPIC -fno-strict-aliasing -mno-red-zone \
-mno-mmx -mno-sse -fshort-wchar
-cflags-$(CONFIG_ARM64) := $(subst -pg,,$(KBUILD_CFLAGS)) -fpie
+cflags-$(CONFIG_ARM64) := $(filter-out -pg $(CC_FLAGS_FTRACE)\
+ ,$(KBUILD_CFLAGS)) -fpie
cflags-$(CONFIG_ARM) := $(subst -pg,,$(KBUILD_CFLAGS)) \
-fno-builtin -fpic -mno-single-pic-base
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -13,6 +13,8 @@
#include <asm/assembler.h>
#include <asm/ftrace.h>
#include <asm/insn.h>
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
/*
* Gcc with -pg will put the following code in the beginning of each function:
@@ -123,6 +125,7 @@ skip_ftrace_call: // }
ENDPROC(_mcount)
#else /* CONFIG_DYNAMIC_FTRACE */
+#ifndef CONFIG_DYNAMIC_FTRACE_WITH_REGS
/*
* _mcount() is used to build the kernel with -pg option, but all the branch
* instructions to _mcount() are replaced to NOP initially at kernel start up,
@@ -162,6 +165,114 @@ ftrace_graph_call: // ftrace_graph_cal
mcount_exit
ENDPROC(ftrace_caller)
+#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+
+/*
+ * Since no -pg or similar compiler flag is used, there should really be
+ * no reference to _mcount; so do not define one. Only some value for
+ * MCOUNT_ADDR is needed for comparison. Let it point here to have some
+ * sort of magic value that can be recognised when debugging.
+ */
+ .global _mcount
+_mcount:
+ ret /* make it differ from regs caller */
+
+ENTRY(ftrace_regs_caller)
+ /* callee's preliminary stack frame: */
+ stp fp, x9, [sp, #-16]!
+ mov fp, sp
+
+ /* our stack frame: */
+ stp fp, lr, [sp, #-S_FRAME_SIZE]!
+ add x9, sp, #16 /* offset to pt_regs */
+
+ stp x10, x11, [x9, #S_X10]
+ stp x12, x13, [x9, #S_X12]
+ stp x14, x15, [x9, #S_X14]
+ stp x16, x17, [x9, #S_X16]
+ stp x18, x19, [x9, #S_X18]
+ stp x20, x21, [x9, #S_X20]
+ stp x22, x23, [x9, #S_X22]
+ stp x24, x25, [x9, #S_X24]
+ stp x26, x27, [x9, #S_X26]
+
+ b ftrace_common
+ENDPROC(ftrace_regs_caller)
+
+ENTRY(ftrace_caller)
+ /* callee's preliminary stack frame: */
+ stp fp, x9, [sp, #-16]!
+ mov fp, sp
+
+ /* our stack frame: */
+ stp fp, lr, [sp, #-S_FRAME_SIZE]!
+ add x9, sp, #16 /* offset to pt_regs */
+
+ftrace_common:
+ /*
+ * At this point we have 2 new stack frames, and x9 pointing
+ * at a pt_regs which we can populate as needed.
+ */
+
+ /* save function arguments */
+ stp x0, x1, [x9]
+ stp x2, x3, [x9, #S_X2]
+ stp x4, x5, [x9, #S_X4]
+ stp x6, x7, [x9, #S_X6]
+ stp x8, x9, [x9, #S_X8]
+
+ ldr x0, [fp]
+ stp x28, x0, [x9, #S_X28] /* FP in pt_regs + "our" x28 */
+
+ /* The program counter just after the ftrace call site */
+ str lr, [x9, #S_PC]
+ /* The stack pointer as it was on ftrace_caller entry... */
+ add x28, fp, #16
+ str x28, [x9, #S_SP]
+ /* The link Register at callee entry */
+ ldr x28, [fp, 8]
+ str x28, [x9, #S_LR] /* to pt_regs.r[30] */
+
+ ldr_l x2, function_trace_op, x0
+ ldr x1, [fp, #8]
+ sub x0, lr, #8 /* function entry == IP */
+ mov x3, x9 /* pt_regs are @x9 */
+
+ mov fp, sp
+
+ .global ftrace_call
+ftrace_call:
+
+ bl ftrace_stub
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ .global ftrace_graph_call
+ftrace_graph_call: // ftrace_graph_caller();
+ nop // If enabled, this will be replaced
+ // "b ftrace_graph_caller"
+#endif
+
+ftrace_common_return:
+ add x9, sp, #16 /* advance to pt_regs for restore */
+
+ ldp x0, x1, [x9]
+ ldp x2, x3, [x9, #S_X2]
+ ldp x4, x5, [x9, #S_X4]
+ ldp x6, x7, [x9, #S_X6]
+ ldp x8, x9, [x9, #S_X8]
+
+ ldp x28, fp, [x9, #S_X28]
+
+ ldr lr, [x9, #S_LR]
+ ldr x9, [x9, #S_PC]
+ /* clean up both frames, ours and callee preliminary */
+ add sp, sp, #S_FRAME_SIZE + 16
+
+ ret x9
+
+ENDPROC(ftrace_caller)
+
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
#endif /* CONFIG_DYNAMIC_FTRACE */
ENTRY(ftrace_stub)
@@ -197,12 +308,21 @@ ENDPROC(ftrace_stub)
* and run return_to_handler() later on its exit.
*/
ENTRY(ftrace_graph_caller)
+#ifndef CONFIG_DYNAMIC_FTRACE_WITH_REGS
mcount_get_lr_addr x0 // pointer to function's saved lr
mcount_get_pc x1 // function's pc
mcount_get_parent_fp x2 // parent's fp
bl prepare_ftrace_return // prepare_ftrace_return(&lr, pc, fp)
mcount_exit
+#else
+ add x9, sp, #16 /* advance to pt_regs to gather args */
+ add x0, x9, #S_LR /* &lr */
+ ldr x1, [x9, #S_PC] /* pc */
+ ldr x2, [x9, #S_STACKFRAME] /* fp */
+ bl prepare_ftrace_return
+ b ftrace_common_return
+#endif
ENDPROC(ftrace_graph_caller)
/*
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -65,18 +65,61 @@ int ftrace_update_ftrace_func(ftrace_fun
return ftrace_modify_code(pc, 0, new, false);
}
+#ifdef CONFIG_ARM64_MODULE_PLTS
+static int install_ftrace_trampoline(struct module *mod, unsigned long *addr)
+{
+ struct plt_entry trampoline, *mod_trampoline;
+ trampoline = get_plt_entry(*addr);
+
+ if (*addr == FTRACE_ADDR)
+ mod_trampoline = mod->arch.ftrace_trampoline;
+ else if (*addr == FTRACE_REGS_ADDR)
+ mod_trampoline = mod->arch.ftrace_regs_trampoline;
+ else
+ return -EINVAL;
+
+ if (!plt_entries_equal(mod_trampoline, &trampoline)) {
+
+ /* point the trampoline to our ftrace entry point */
+ module_disable_ro(mod);
+ *mod_trampoline = trampoline;
+ module_enable_ro(mod, true);
+
+ /* update trampoline before patching in the branch */
+ smp_wmb();
+ }
+ *addr = (unsigned long)(void *)mod_trampoline;
+
+ return 0;
+}
+#endif
+
+/*
+ * Ftrace with regs generates the tracer calls as close as possible to
+ * the function entry; no stack frame has been set up at that point.
+ * In order to make another call, e.g. to ftrace_caller, the LR must be
+ * saved from being overwritten.
+ * Between two functions, and with IPA-RA turned off, the scratch registers
+ * are available, so move the LR to x9 before calling into ftrace.
+ * "mov x9, lr" is officially aliased from "orr x9, xzr, lr".
+ */
+#define QUICK_LR_SAVE aarch64_insn_gen_logical_shifted_reg( \
+ AARCH64_INSN_REG_9, AARCH64_INSN_REG_ZR, \
+ AARCH64_INSN_REG_LR, 0, AARCH64_INSN_VARIANT_64BIT, \
+ AARCH64_INSN_LOGIC_ORR)
+
/*
* Turn on the call to ftrace_caller() in instrumented function
*/
int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
{
- unsigned long pc = rec->ip;
+ unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
+ int ret;
u32 old, new;
long offset = (long)pc - (long)addr;
if (offset < -SZ_128M || offset >= SZ_128M) {
#ifdef CONFIG_ARM64_MODULE_PLTS
- struct plt_entry trampoline;
struct module *mod;
/*
@@ -96,54 +139,65 @@ int ftrace_make_call(struct dyn_ftrace *
if (WARN_ON(!mod))
return -EINVAL;
- /*
- * There is only one ftrace trampoline per module. For now,
- * this is not a problem since on arm64, all dynamic ftrace
- * invocations are routed via ftrace_caller(). This will need
- * to be revisited if support for multiple ftrace entry points
- * is added in the future, but for now, the pr_err() below
- * deals with a theoretical issue only.
- */
- trampoline = get_plt_entry(addr);
- if (!plt_entries_equal(mod->arch.ftrace_trampoline,
- &trampoline)) {
- if (!plt_entries_equal(mod->arch.ftrace_trampoline,
- &(struct plt_entry){})) {
- pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
- return -EINVAL;
- }
-
- /* point the trampoline to our ftrace entry point */
- module_disable_ro(mod);
- *mod->arch.ftrace_trampoline = trampoline;
- module_enable_ro(mod, true);
-
- /* update trampoline before patching in the branch */
- smp_wmb();
+ /* Check against our well-known list of ftrace entry points */
+ if (addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR) {
+ ret = install_ftrace_trampoline(mod, &addr);
+ if (ret < 0)
+ return ret;
}
- addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
+ else
+ return -EINVAL;
+
#else /* CONFIG_ARM64_MODULE_PLTS */
return -EINVAL;
#endif /* CONFIG_ARM64_MODULE_PLTS */
}
old = aarch64_insn_gen_nop();
+ if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
+ new = QUICK_LR_SAVE;
+ ret = ftrace_modify_code(pc - AARCH64_INSN_SIZE,
+ old, new, true);
+ if (ret)
+ return ret;
+ }
new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
return ftrace_modify_code(pc, old, new, true);
}
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
+ unsigned long addr)
+{
+ unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
+ u32 old, new;
+
+ old = aarch64_insn_gen_branch_imm(pc, old_addr, true);
+ new = aarch64_insn_gen_branch_imm(pc, addr, true);
+
+ return ftrace_modify_code(pc, old, new, true);
+}
+#endif
+
/*
* Turn off the call to ftrace_caller() in instrumented function
*/
int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
unsigned long addr)
{
- unsigned long pc = rec->ip;
+ unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
bool validate = true;
+ int ret;
u32 old = 0, new;
long offset = (long)pc - (long)addr;
+ /* -fpatchable-function-entry= does not generate a profiling call
+ * initially; the NOPs are already there.
+ */
+ if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) && addr == MCOUNT_ADDR)
+ return 0;
+
if (offset < -SZ_128M || offset >= SZ_128M) {
#ifdef CONFIG_ARM64_MODULE_PLTS
u32 replaced;
@@ -188,7 +242,15 @@ int ftrace_make_nop(struct module *mod,
new = aarch64_insn_gen_nop();
- return ftrace_modify_code(pc, old, new, validate);
+ ret = ftrace_modify_code(pc, old, new, validate);
+ if (ret)
+ return ret;
+ if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
+ old = QUICK_LR_SAVE;
+ ret = ftrace_modify_code(pc - AARCH64_INSN_SIZE,
+ old, new, true);
+ }
+ return ret;
}
void arch_ftrace_update_code(int command)
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -115,6 +115,7 @@
#define MCOUNT_REC() . = ALIGN(8); \
__start_mcount_loc = .; \
KEEP(*(__mcount_loc)) \
+ KEEP(*(__patchable_function_entries)) \
__stop_mcount_loc = .;
#else
#define MCOUNT_REC()
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -61,8 +61,12 @@ extern void __chk_io_ptr(const volatile
#if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__)
#define notrace __attribute__((hotpatch(0,0)))
#else
+#if defined(CONFIG_ARM64) && defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS)
+#define notrace __attribute__((patchable_function_entry(0)))
+#else
#define notrace __attribute__((no_instrument_function))
#endif
+#endif
/* Intel compiler defines __GNUC__. So we will overwrite implementations
* coming from above header files here
--- a/arch/arm64/include/asm/module.h
+++ b/arch/arm64/include/asm/module.h
@@ -33,6 +33,7 @@ struct mod_arch_specific {
/* for CONFIG_DYNAMIC_FTRACE */
struct plt_entry *ftrace_trampoline;
+ struct plt_entry *ftrace_regs_trampoline;
};
#endif
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -452,8 +452,11 @@ int module_finalize(const Elf_Ehdr *hdr,
apply_alternatives_module((void *)s->sh_addr, s->sh_size);
#ifdef CONFIG_ARM64_MODULE_PLTS
if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
- !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name))
+ !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name)) {
me->arch.ftrace_trampoline = (void *)s->sh_addr;
+ me->arch.ftrace_regs_trampoline =
+ (void *)(s->sh_addr + sizeof(struct plt_entry));
+ }
#endif
}
--- a/arch/arm64/kernel/module-plts.c
+++ b/arch/arm64/kernel/module-plts.c
@@ -272,7 +272,7 @@ int module_frob_arch_sections(Elf_Ehdr *
tramp->sh_type = SHT_NOBITS;
tramp->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
tramp->sh_addralign = __alignof__(struct plt_entry);
- tramp->sh_size = sizeof(struct plt_entry);
+ tramp->sh_size = 2 * sizeof(struct plt_entry);
}
return 0;
Based on ftrace with regs, do the usual thing.
(see Documentation/livepatch/livepatch.txt)
Use task flag bit 6 to track patch transition state for the consistency
model. Add it to the work mask so it gets cleared on all kernel exits to
userland.
Tell livepatch regs->pc is the place to change the return address.
Make sure the graph tracer call hook is only called on the final function
entry in case regs->pc gets modified after an interception.
Signed-off-by: Torsten Duwe <[email protected]>
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -120,6 +120,7 @@ config ARM64
select HAVE_GENERIC_DMA_COHERENT
select HAVE_HW_BREAKPOINT if PERF_EVENTS
select HAVE_IRQ_TIME_ACCOUNTING
+ select HAVE_LIVEPATCH
select HAVE_MEMBLOCK
select HAVE_MEMBLOCK_NODE_MAP if NUMA
select HAVE_NMI
@@ -1350,4 +1351,6 @@ if CRYPTO
source "arch/arm64/crypto/Kconfig"
endif
+source "kernel/livepatch/Kconfig"
+
source "lib/Kconfig"
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -76,6 +76,7 @@ void arch_release_task_struct(struct tas
#define TIF_FOREIGN_FPSTATE 3 /* CPU's FP state is not current's */
#define TIF_UPROBE 4 /* uprobe breakpoint or singlestep */
#define TIF_FSCHECK 5 /* Check FS is USER_DS on return */
+#define TIF_PATCH_PENDING 6
#define TIF_NOHZ 7
#define TIF_SYSCALL_TRACE 8
#define TIF_SYSCALL_AUDIT 9
@@ -94,6 +95,7 @@ void arch_release_task_struct(struct tas
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
#define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
+#define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
#define _TIF_NOHZ (1 << TIF_NOHZ)
#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
#define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
@@ -106,7 +108,8 @@ void arch_release_task_struct(struct tas
#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
_TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
- _TIF_UPROBE | _TIF_FSCHECK)
+ _TIF_UPROBE | _TIF_FSCHECK | \
+ _TIF_PATCH_PENDING)
#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
_TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
--- /dev/null
+++ b/arch/arm64/include/asm/livepatch.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * livepatch.h - arm64-specific Kernel Live Patching Core
+ *
+ * Copyright (C) 2016,2018 SUSE
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef _ASM_ARM64_LIVEPATCH_H
+#define _ASM_ARM64_LIVEPATCH_H
+
+#include <asm/ptrace.h>
+
+static inline int klp_check_compiler_support(void)
+{
+ return 0;
+}
+
+static inline void klp_arch_set_pc(struct pt_regs *regs, unsigned long ip)
+{
+ regs->pc = ip;
+}
+
+#endif /* _ASM_ARM64_LIVEPATCH_H */
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -226,6 +226,7 @@ ftrace_common:
/* The program counter just after the ftrace call site */
str lr, [x9, #S_PC]
+
/* The stack pointer as it was on ftrace_caller entry... */
add x28, fp, #16
str x28, [x9, #S_SP]
@@ -233,6 +234,10 @@ ftrace_common:
ldr x28, [fp, 8]
str x28, [x9, #S_LR] /* to pt_regs.r[30] */
+#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
+ mov x28, lr /* remember old return address */
+#endif
+
ldr_l x2, function_trace_op, x0
ldr x1, [fp, #8]
sub x0, lr, #8 /* function entry == IP */
@@ -245,6 +250,17 @@ ftrace_call:
bl ftrace_stub
+#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
+ /* Has the trace function, e.g. a live patcher, messed with
+ * the return address?
+ */
+ add x9, sp, #16 /* advance to pt_regs for restore */
+ ldr x0, [x9, #S_PC]
+ cmp x0, x28 /* compare with the value we remembered */
+ /* to not call graph tracer's "call" mechanism twice! */
+ b.ne ftrace_common_return
+#endif
+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
.global ftrace_graph_call
ftrace_graph_call: // ftrace_graph_caller();
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -29,6 +29,7 @@
#include <linux/sizes.h>
#include <linux/string.h>
#include <linux/tracehook.h>
+#include <linux/livepatch.h>
#include <linux/ratelimit.h>
#include <linux/syscalls.h>
@@ -934,6 +935,9 @@ asmlinkage void do_notify_resume(struct
if (thread_flags & _TIF_UPROBE)
uprobe_notify_resume(regs);
+ if (thread_flags & _TIF_PATCH_PENDING)
+ klp_update_patch_state(current);
+
if (thread_flags & _TIF_SIGPENDING)
do_signal(regs);
Enhance the stack unwinder so that it reports whether it had to stop
normally or due to an error condition; unwind_frame() will report
continue/error/normal ending and walk_stackframe() will pass that
info. __save_stack_trace() is used to check the validity of a stack;
save_stack_trace_tsk_reliable() can now trivially be implemented.
Modify arch/arm64/kernel/time.c as the only external caller so far
to recognise the new semantics.
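For reference, a caller is expected to consume the new return convention
like this (hypothetical helper, just a sketch of the pattern; 0 means
more frames to go, 1 means clean end of stack, negative means not
reliable):

	static int stack_is_reliable(struct task_struct *tsk,
				     struct stackframe *frame)
	{
		int ret;

		do {
			ret = unwind_frame(tsk, frame);
		} while (ret == 0);

		return ret == 1 ? 0 : ret;
	}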
I had to introduce a marker symbol kthread_return_to_user to tell
the normal origin of a kernel thread.
Signed-off-by: Torsten Duwe <[email protected]>
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -128,8 +128,9 @@ config ARM64
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
- select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RCU_TABLE_FREE
+ select HAVE_REGS_AND_STACK_ACCESS_API
+ select HAVE_RELIABLE_STACKTRACE
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_KPROBES
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -33,7 +33,7 @@ struct stackframe {
};
extern int unwind_frame(struct task_struct *tsk, struct stackframe *frame);
-extern void walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
+extern int walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
int (*fn)(struct stackframe *, void *), void *data);
extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk);
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -40,6 +40,16 @@
* ldp x29, x30, [sp]
* add sp, sp, #0x10
*/
+
+/* The bottom of kernel thread stacks points there */
+extern void *kthread_return_to_user;
+
+/*
+ * unwind_frame -- unwind a single stack frame.
+ * Returns 0 when there are more frames to go.
+ * 1 means reached end of stack; negative (error)
+ * means stopped because information is not reliable.
+ */
int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
{
unsigned long fp = frame->fp;
@@ -75,29 +85,39 @@ int notrace unwind_frame(struct task_str
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
/*
+ * kthreads created via copy_thread() (called from kthread_create())
+ * will have a zero FP and a return address into ret_from_fork.
+ */
+ if (!frame->fp && frame->pc == (unsigned long)&kthread_return_to_user)
+ return 1;
+ /*
* Frames created upon entry from EL0 have NULL FP and PC values, so
* don't bother reporting these. Frames created by __noreturn functions
* might have a valid FP even if PC is bogus, so only terminate where
* both are NULL.
*/
if (!frame->fp && !frame->pc)
- return -EINVAL;
+ return 1;
return 0;
}
-void notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
+int notrace walk_stackframe(struct task_struct *tsk, struct stackframe *frame,
int (*fn)(struct stackframe *, void *), void *data)
{
while (1) {
int ret;
- if (fn(frame, data))
- break;
+ ret = fn(frame, data);
+ if (ret)
+ return ret;
ret = unwind_frame(tsk, frame);
if (ret < 0)
+ return ret;
+ if (ret > 0)
break;
}
+ return 0;
}
#ifdef CONFIG_STACKTRACE
@@ -145,14 +165,15 @@ void save_stack_trace_regs(struct pt_reg
trace->entries[trace->nr_entries++] = ULONG_MAX;
}
-static noinline void __save_stack_trace(struct task_struct *tsk,
+static noinline int __save_stack_trace(struct task_struct *tsk,
struct stack_trace *trace, unsigned int nosched)
{
struct stack_trace_data data;
struct stackframe frame;
+ int ret;
if (!try_get_task_stack(tsk))
- return;
+ return -EBUSY;
data.trace = trace;
data.skip = trace->skip;
@@ -171,11 +192,12 @@ static noinline void __save_stack_trace(
frame.graph = tsk->curr_ret_stack;
#endif
- walk_stackframe(tsk, &frame, save_trace, &data);
+ ret = walk_stackframe(tsk, &frame, save_trace, &data);
if (trace->nr_entries < trace->max_entries)
trace->entries[trace->nr_entries++] = ULONG_MAX;
put_task_stack(tsk);
+ return ret;
}
EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
@@ -190,4 +212,12 @@ void save_stack_trace(struct stack_trace
}
EXPORT_SYMBOL_GPL(save_stack_trace);
+
+int save_stack_trace_tsk_reliable(struct task_struct *tsk,
+ struct stack_trace *trace)
+{
+ return __save_stack_trace(tsk, trace, 1);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace_tsk_reliable);
+
#endif
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -56,7 +56,7 @@ unsigned long profile_pc(struct pt_regs
#endif
do {
int ret = unwind_frame(NULL, &frame);
- if (ret < 0)
+ if (ret)
return 0;
} while (in_lock_functions(frame.pc));
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -1178,15 +1178,17 @@ ENTRY(cpu_switch_to)
ENDPROC(cpu_switch_to)
NOKPROBE(cpu_switch_to)
+ .global kthread_return_to_user
/*
* This is how we return from a fork.
*/
ENTRY(ret_from_fork)
bl schedule_tail
- cbz x19, 1f // not a kernel thread
+ cbz x19, kthread_return_to_user // not a kernel thread
mov x0, x20
blr x19
-1: get_thread_info tsk
+kthread_return_to_user:
+ get_thread_info tsk
b ret_to_user
ENDPROC(ret_from_fork)
NOKPROBE(ret_from_fork)
On Fri, Oct 26, 2018 at 04:21:57PM +0200, Torsten Duwe wrote:
> Enhance the stack unwinder so that it reports whether it had to stop
> normally or due to an error condition; unwind_frame() will report
> continue/error/normal ending and walk_stackframe() will pass that
> info. __save_stack_trace() is used to check the validity of a stack;
> save_stack_trace_tsk_reliable() can now trivially be implemented.
> Modify arch/arm64/kernel/time.c as the only external caller so far
> to recognise the new semantics.
>
> I had to introduce a marker symbol kthread_return_to_user to tell
> the normal origin of a kernel thread.
>
> Signed-off-by: Torsten Duwe <[email protected]>
I haven't looked at the code, but the commit log doesn't inspire much
confidence. It's missing everything I previously asked for in the
powerpc version.
There's zero mention of objtool. What analysis was done to indicate
that we can rely on frame pointers?
Such a frame pointer analysis should be included in the commit log. It
should describe *at least* the following:
- whether inline asm statements with call/branch instructions will
confuse GCC into skipping the frame pointer setup if it considers the
function to be a leaf function;
- whether hand-coded non-leaf assembly functions can accidentally omit
the frame pointer prologue setup;
- whether GCC can generally be relied upon to get arm64 frame pointers
right, in both normal operation and edge cases.
The commit log should also describe whether the unwinder itself can be
considered reliable for all edge cases:
- detection and reporting of preemption and page faults;
- detection and recovery from function graph tracing;
- detection and reporting of other unexpected conditions,
including when the unwinder doesn't reach the end of the stack.
--
Josh
Hi Josh,
I also have a few concerns here, as it is not clear to me precisely what is
required from arch code. Is there any documentation I should look at?
On Fri, Oct 26, 2018 at 10:37:04AM -0500, Josh Poimboeuf wrote:
> On Fri, Oct 26, 2018 at 04:21:57PM +0200, Torsten Duwe wrote:
> > Enhance the stack unwinder so that it reports whether it had to stop
> > normally or due to an error condition; unwind_frame() will report
> > continue/error/normal ending and walk_stackframe() will pass that
> > info. __save_stack_trace() is used to check the validity of a stack;
> > save_stack_trace_tsk_reliable() can now trivially be implemented.
> > Modify arch/arm64/kernel/time.c as the only external caller so far
> > to recognise the new semantics.
There are a number of error conditions not currently handled by the unwinder
(mostly in the face of stack corruption), for which there have been prior
discussions on list.
Do we care about those cases, or do we consider things best-effort in the face
of stack corruption?
> > I had to introduce a marker symbol kthread_return_to_user to tell
> > the normal origin of a kernel thread.
> >
> > Signed-off-by: Torsten Duwe <[email protected]>
>
> I haven't looked at the code, but the commit log doesn't inspire much
> confidence. It's missing everything I previously asked for in the
> powerpc version.
>
> There's zero mention of objtool. What analysis was done to indicate
> that we can rely on frame pointers?
>
> Such a frame pointer analysis should be included in the commit log. It
> should describe *at least* the following:
>
> - whether inline asm statements with call/branch instructions will
> confuse GCC into skipping the frame pointer setup if it considers the
> function to be a leaf function;
There's a reasonable chance that the out-of-line LL/SC atomics could confuse
GCC into thinking callers are leaf functions. That's the only inline asm that
I'm aware of with BL instructions (how calls are made on arm64).
> - whether hand-coded non-leaf assembly functions can accidentally omit
> the frame pointer prologue setup;
Most of our assembly doesn't setup stackframes, and some of these are non-leaf,
e.g. __cpu_suspend_enter.
Also, I suspect our entry assembly may violate/confuse assumptions here. I've
been working to move more of that to C, but that isn't yet complete.
> - whether GCC can generally be relied upon to get arm64 frame pointers
> right, in both normal operation and edge cases.
>
> The commit log should also describe whether the unwinder itself can be
> considered reliable for all edge cases:
>
> - detection and reporting of preemption and page faults;
>
> - detection and recovery from function graph tracing;
>
> - detection and reporting of other unexpected conditions,
> including when the unwinder doesn't reach the end of the stack.
We may also have NMIs (with SDEI).
Thanks,
Mark.
On Mon, Oct 29, 2018 at 09:28:12AM +0000, Mark Rutland wrote:
> Hi Josh,
>
> I also have a few concerns here, as it is not clear to me precisely what is
> required from arch code. Is there any documentation I should look at?
The short answer is that we need:
1) Reliable frame pointers -- on x86 we do that with objtool:
tools/objtool/Documentation/stack-validation.txt
2) Reliable unwinder -- on x86 we had to rewrite the unwinder. There's
no documentation but the code is simple enough. See
unwind_next_frame() in arch/x86/kernel/unwind_frame.c and
__save_stack_trace_reliable() in arch/x86/kernel/stacktrace.c.
> On Fri, Oct 26, 2018 at 10:37:04AM -0500, Josh Poimboeuf wrote:
> > On Fri, Oct 26, 2018 at 04:21:57PM +0200, Torsten Duwe wrote:
> > > Enhance the stack unwinder so that it reports whether it had to stop
> > > normally or due to an error condition; unwind_frame() will report
> > > continue/error/normal ending and walk_stackframe() will pass that
> > > info. __save_stack_trace() is used to check the validity of a stack;
> > > save_stack_trace_tsk_reliable() can now trivially be implemented.
> > > Modify arch/arm64/kernel/time.c as the only external caller so far
> > > to recognise the new semantics.
>
> There are a number of error conditions not currently handled by the unwinder
> (mostly in the face of stack corruption), for which there have been prior
> discussions on list.
>
> Do we care about those cases, or do we consider things best-effort in the face
> of stack corruption?
The unwinder needs to be able to detect all stack corruption and return
an error.
[ But note that we don't need to worry about unwinding a task's stack
while the task is running, which can be a common source of
"corruption". For livepatch we make sure every task is blocked
(except when checking the current task). ]
It also needs to:
- detect preemption / page fault frames and return an error
- only return success if it reaches the end of the task stack; for user
tasks, that means the syscall barrier; for kthreads/idle tasks, that
means finding a defined thread entry point
- make sure it can't get into a recursive loop
- make sure each return address is a valid text address
- properly detect generated code hacks like function graph tracing and
kretprobes
> > > I had to introduce a marker symbol kthread_return_to_user to tell
> > > the normal origin of a kernel thread.
> > >
> > > Signed-off-by: Torsten Duwe <[email protected]>
> >
> > I haven't looked at the code, but the commit log doesn't inspire much
> > confidence. It's missing everything I previously asked for in the
> > powerpc version.
> >
> > There's zero mention of objtool. What analysis was done to indicate
> > that we can rely on frame pointers?
> >
> > Such a frame pointer analysis should be included in the commit log. It
> > should describe *at least* the following:
> >
> > - whether inline asm statements with call/branch instructions will
> > confuse GCC into skipping the frame pointer setup if it considers the
> > function to be a leaf function;
>
> There's a reasonable chance that the out-of-line LL/SC atomics could confuse
> GCC into thinking callers are leaf functions. That's the only inline asm that
> I'm aware of with BL instructions (how calls are made on arm64).
>
> > - whether hand-coded non-leaf assembly functions can accidentally omit
> > the frame pointer prologue setup;
>
> Most of our assembly doesn't setup stackframes, and some of these are non-leaf,
> e.g. __cpu_suspend_enter.
>
> Also, I suspect our entry assembly may violate/confuse assumptions here. I've
> been working to move more of that to C, but that isn't yet complete.
My experience with arm64 is very limited, but it sounds like it has some
of the same issues as x86. In which case we may need to port objtool to
arm64.
> > - whether GCC can generally be relied upon to get arm64 frame pointers
> > right, in both normal operation and edge cases.
> >
> > The commit log should also describe whether the unwinder itself can be
> > considered reliable for all edge cases:
> >
> > - detection and reporting of preemption and page faults;
> >
> > - detection and recovery from function graph tracing;
> >
> > - detection and reporting of other unexpected conditions,
> > including when the unwinder doesn't reach the end of the stack.
>
> We may also have NMIs (with SDEI).
NMIs shouldn't be an issue because livepatch only unwinds blocked tasks.
--
Josh
Hi Torsten,
On Fri, Oct 26, 2018 at 04:21:48PM +0200, Torsten Duwe wrote:
> Use -fpatchable-function-entry (gcc8) to add 2 NOPs at the beginning
> of each function. Replace the first NOP thus generated with a quick LR
> saver (move it to scratch reg x9), so the 2nd replacement insn, the call
> to ftrace, does not clobber the value. Ftrace will then generate the
> standard stack frames.
I'd like to understand why you need patchable-function-entry to
implement DYNAMIC_FTRACE_WITH_REGS on arm64. No other architectures with
DYNAMIC_FTRACE_WITH_REGS require this, including arm.
I guess skipping the original function prologue would simplify the
implementation of the replacement function (and would mean that the regs
held the function arguments per the procedure call standard), but AFAICT
other architectures aren't relying on that, so it doesn't seem to be a
strict requirement.
What am I missing?
How does livepatching handle the pre-mcount function preambles on
architectures with existing support?
FWIW, I think that patchable-function-entry would solve an upcoming
problem on arm64. To support in-kernel pointer authentication with the
function graph tracer, we need to snapshot the SP at function entry
(before any preamble), and we can't do that reliably with mcount.
Thanks,
Mark.
On Wed, 31 Oct 2018, Mark Rutland wrote:
> I guess skipping the original function prologue would simplify the
> implementation of the replacement function (and would mean that the regs
> held the function arguments per the procedure call standard), but AFAICT
> other architectures aren't relying on that, so it doesn't seem to be a
> strict requirement.
>
> What am I missing?
>
> How does livepatching handle the pre-mcount function preambles on
> architectures with existing support?
Other architectures do rely on that. That's exactly for example why on x86
we use '-pg -mfentry', to make sure we hook the function *before*
prologue.
Thanks,
--
Jiri Kosina
SUSE Labs
On Wed, Oct 31, 2018 at 02:19:07PM +0100, Jiri Kosina wrote:
> On Wed, 31 Oct 2018, Mark Rutland wrote:
>
> > I guess skipping the original function prologue would simplify the
> > implementation of the replacement function (and would mean that the regs
> > held the function arguments per the procedure call standard), but AFAICT
> > other architectures aren't relying on that, so it doesn't seem to be a
> > strict requirement.
> >
> > What am I missing?
> >
> > How does livepatching handle the pre-mcount function preambles on
> > architectures with existing support?
>
> Other architectures do rely on that. That's exactly for example why on x86
> we use '-pg -mfentry', to make sure we hook the function *before*
> prologue.
Ah, I'd missed -mfentry for x86. I now see that's also the case with
__gnu_mcount_nc on arch/arm, so that covers my confusion.
Thanks for correcting me, and sorry for noise!
Mark.
On Wed, 31 Oct 2018 14:18:19 +0000
Mark Rutland <[email protected]> wrote:
> On Wed, Oct 31, 2018 at 02:19:07PM +0100, Jiri Kosina wrote:
> > Other architectures do rely on that. That's exactly for example why
> > on x86 we use '-pg -mfentry', to make sure we hook the function
> > *before* prologue.
>
> Ah, I'd missed -mfentry for x86. I now see that's also the case with
> __gnu_mcount_nc on arch/arm, so that covers my confusion.
Yes, fentry used to be the prerequisite, but it's anything but
portable. PPC64 already had the profile-kernel switch, which was
just becoming usable when we started on live patching.
I'm hoping that patchable-function-entry will become the future
de-facto standard.
Torsten
Hi,
On Fri, 26 Oct 2018, Torsten Duwe wrote:
> Based on ftrace with regs, do the usual thing.
> (see Documentation/livepatch/livepatch.txt)
>
> Use task flag bit 6 to track patch transition state for the consistency
> model. Add it to the work mask so it gets cleared on all kernel exits to
> userland.
>
> Tell livepatch regs->pc is the place to change the return address.
> Make sure the graph tracer call hook is only called on the final function
> entry in case regs->pc gets modified after an interception.
>
> Signed-off-by: Torsten Duwe <[email protected]>
It looks good now apart from the arm64 asm part, which should be reviewed
by someone else.
However, could you summarize our analysis regarding post-module-load calls
of apply_relocate_add() in the commit log, please? It is important for
future reference.
Thanks,
Miroslav
Hi Torsten,
On 26 October 2018 at 16:21, Torsten Duwe <[email protected]> wrote:
> Use -fpatchable-function-entry (gcc8) to add 2 NOPs at the beginning
> of each function. Replace the first NOP thus generated with a quick LR
> saver (move it to scratch reg x9), so the 2nd replacement insn, the call
> to ftrace, does not clobber the value. Ftrace will then generate the
> standard stack frames.
>
> Note that patchable-function-entry in GCC disables IPA-RA, which means
> ABI register calling conventions are obeyed *and* scratch registers
> such as x9 are available.
>
> Introduce and handle an ftrace_regs_trampoline for module PLTs, right
> after ftrace_trampoline, and double the size of this special section
> if .text.ftrace_trampoline is present in the module.
>
> Signed-off-by: Torsten Duwe <[email protected]>
>
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -110,6 +110,8 @@ config ARM64
> select HAVE_DEBUG_KMEMLEAK
> select HAVE_DMA_CONTIGUOUS
> select HAVE_DYNAMIC_FTRACE
> + select HAVE_DYNAMIC_FTRACE_WITH_REGS \
> + if $(cc-option,-fpatchable-function-entry=2)
> select HAVE_EFFICIENT_UNALIGNED_ACCESS
> select HAVE_FTRACE_MCOUNT_RECORD
> select HAVE_FUNCTION_TRACER
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -78,6 +78,10 @@ ifeq ($(CONFIG_ARM64_MODULE_PLTS),y)
> KBUILD_LDFLAGS_MODULE += -T $(srctree)/arch/arm64/kernel/module.lds
> endif
>
> +ifeq ($(CONFIG_DYNAMIC_FTRACE_WITH_REGS),y)
> + CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> +endif
> +
> # Default value
> head-y := arch/arm64/kernel/head.o
>
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -16,6 +16,19 @@
> #define MCOUNT_ADDR ((unsigned long)_mcount)
> #define MCOUNT_INSN_SIZE AARCH64_INSN_SIZE
>
> +/*
> + * DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
> + * of each function, with the second NOP actually calling ftrace. In contrast
> + * to a classic _mcount call, the call instruction to be modified is thus
> + * the second one, and not the only one.
> + */
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +#define ARCH_SUPPORTS_FTRACE_OPS 1
> +#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
> +#else
> +#define REC_IP_BRANCH_OFFSET 0
> +#endif
> +
> #ifndef __ASSEMBLY__
> #include <linux/compat.h>
>
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -7,9 +7,9 @@ CPPFLAGS_vmlinux.lds := -DTEXT_OFFSET=$(
> AFLAGS_head.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
> CFLAGS_armv8_deprecated.o := -I$(src)
>
> -CFLAGS_REMOVE_ftrace.o = -pg
> -CFLAGS_REMOVE_insn.o = -pg
> -CFLAGS_REMOVE_return_address.o = -pg
> +CFLAGS_REMOVE_ftrace.o = -pg $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_insn.o = -pg $(CC_FLAGS_FTRACE)
> +CFLAGS_REMOVE_return_address.o = -pg $(CC_FLAGS_FTRACE)
>
> # Object file lists.
> arm64-obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
> --- a/drivers/firmware/efi/libstub/Makefile
> +++ b/drivers/firmware/efi/libstub/Makefile
> @@ -11,7 +11,8 @@ cflags-$(CONFIG_X86) += -m$(BITS) -D__K
> -fPIC -fno-strict-aliasing -mno-red-zone \
> -mno-mmx -mno-sse -fshort-wchar
>
> -cflags-$(CONFIG_ARM64) := $(subst -pg,,$(KBUILD_CFLAGS)) -fpie
> +cflags-$(CONFIG_ARM64) := $(filter-out -pg $(CC_FLAGS_FTRACE)\
> + ,$(KBUILD_CFLAGS)) -fpie
> cflags-$(CONFIG_ARM) := $(subst -pg,,$(KBUILD_CFLAGS)) \
> -fno-builtin -fpic -mno-single-pic-base
>
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -13,6 +13,8 @@
> #include <asm/assembler.h>
> #include <asm/ftrace.h>
> #include <asm/insn.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/assembler.h>
>
> /*
> * Gcc with -pg will put the following code in the beginning of each function:
> @@ -123,6 +125,7 @@ skip_ftrace_call: // }
> ENDPROC(_mcount)
>
> #else /* CONFIG_DYNAMIC_FTRACE */
> +#ifndef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> /*
> * _mcount() is used to build the kernel with -pg option, but all the branch
> * instructions to _mcount() are replaced to NOP initially at kernel start up,
> @@ -162,6 +165,114 @@ ftrace_graph_call: // ftrace_graph_cal
>
> mcount_exit
> ENDPROC(ftrace_caller)
> +#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
> +
> +/*
> + * Since no -pg or similar compiler flag is used, there should really be
> + * no reference to _mcount; so do not define one. Only some value for
> + * MCOUNT_ADDR is needed for comparison. Let it point here to have some
> + * sort of magic value that can be recognised when debugging.
> + */
> + .global _mcount
> +_mcount:
> + ret /* make it differ from regs caller */
> +
> +ENTRY(ftrace_regs_caller)
> + /* callee's preliminary stack frame: */
> + stp fp, x9, [sp, #-16]!
Does the 'fp' alias for x29 work with older assemblers? I guess it
does not matter for GCC 8+ code, but be careful when you rewrite
existing stuff.
> + mov fp, sp
> +
> + /* our stack frame: */
> + stp fp, lr, [sp, #-S_FRAME_SIZE]!
If sizeof(struct pt_regs) == S_FRAME_SIZE, you should subtract 16
additional bytes here
stp fp, lr, [sp, #-(S_FRAME_SIZE + 16)]!
> + add x9, sp, #16 /* offset to pt_regs */
> +
... since that is the offset you use here
> + stp x10, x11, [x9, #S_X10]
> + stp x12, x13, [x9, #S_X12]
> + stp x14, x15, [x9, #S_X14]
> + stp x16, x17, [x9, #S_X16]
> + stp x18, x19, [x9, #S_X18]
> + stp x20, x21, [x9, #S_X20]
> + stp x22, x23, [x9, #S_X22]
> + stp x24, x25, [x9, #S_X24]
> + stp x26, x27, [x9, #S_X26]
> +
> + b ftrace_common
> +ENDPROC(ftrace_regs_caller)
> +
> +ENTRY(ftrace_caller)
> + /* callee's preliminary stack frame: */
> + stp fp, x9, [sp, #-16]!
> + mov fp, sp
> +
> + /* our stack frame: */
> + stp fp, lr, [sp, #-S_FRAME_SIZE]!
> + add x9, sp, #16 /* offset to pt_regs */
Same here
> +
> +ftrace_common:
> + /*
> + * At this point we have 2 new stack frames, and x9 pointing
> + * at a pt_regs which we can populate as needed.
> + */
> +
> + /* save function arguments */
> + stp x0, x1, [x9]
> + stp x2, x3, [x9, #S_X2]
> + stp x4, x5, [x9, #S_X4]
> + stp x6, x7, [x9, #S_X6]
> + stp x8, x9, [x9, #S_X8]
> +
x9 is not a function argument, and if it were, you'd have clobbered it
by now. Please use a single 'str' and store x8 only
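i.e. something like

	str	x8, [x9, #S_X8]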
> + ldr x0, [fp]
> + stp x28, x0, [x9, #S_X28] /* FP in pt_regs + "our" x28 */
> +
> + /* The program counter just after the ftrace call site */
> + str lr, [x9, #S_PC]
> + /* The stack pointer as it was on ftrace_caller entry... */
> + add x28, fp, #16
> + str x28, [x9, #S_SP]
> + /* The link Register at callee entry */
> + ldr x28, [fp, 8]
> + str x28, [x9, #S_LR] /* to pt_regs.r[30] */
> +
> + ldr_l x2, function_trace_op, x0
> + ldr x1, [fp, #8]
> + sub x0, lr, #8 /* function entry == IP */
> + mov x3, x9 /* pt_regs are @x9 */
> +
> + mov fp, sp
> +
> + .global ftrace_call
> +ftrace_call:
> +
> + bl ftrace_stub
> +
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> + .global ftrace_graph_call
> +ftrace_graph_call: // ftrace_graph_caller();
> + nop // If enabled, this will be replaced
> + // "b ftrace_graph_caller"
> +#endif
> +
> +ftrace_common_return:
> + add x9, sp, #16 /* advance to pt_regs for restore */
> +
> + ldp x0, x1, [x9]
> + ldp x2, x3, [x9, #S_X2]
> + ldp x4, x5, [x9, #S_X4]
> + ldp x6, x7, [x9, #S_X6]
> + ldp x8, x9, [x9, #S_X8]
> +
Same as above. It also deserves a mention that you are relying on the
absence of IPA-RA, so that x9..x18 are guaranteed to be dead at
function entry and don't need to be restored here.
> + ldp x28, fp, [x9, #S_X28]
> +
> + ldr lr, [x9, #S_LR]
> + ldr x9, [x9, #S_PC]
> + /* clean up both frames, ours and callee preliminary */
> + add sp, sp, #S_FRAME_SIZE + 16
> +
> + ret x9
> +
> +ENDPROC(ftrace_caller)
> +
> +#endif /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
> #endif /* CONFIG_DYNAMIC_FTRACE */
>
> ENTRY(ftrace_stub)
> @@ -197,12 +308,21 @@ ENDPROC(ftrace_stub)
> * and run return_to_handler() later on its exit.
> */
> ENTRY(ftrace_graph_caller)
> +#ifndef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> mcount_get_lr_addr x0 // pointer to function's saved lr
> mcount_get_pc x1 // function's pc
> mcount_get_parent_fp x2 // parent's fp
> bl prepare_ftrace_return // prepare_ftrace_return(&lr, pc, fp)
>
> mcount_exit
> +#else
> + add x9, sp, #16 /* advance to pt_regs to gather args */
> + add x0, x9, #S_LR /* &lr */
> + ldr x1, [x9, #S_PC] /* pc */
> + ldr x2, [x9, #S_STACKFRAME] /* fp */
> + bl prepare_ftrace_return
> + b ftrace_common_return
> +#endif
> ENDPROC(ftrace_graph_caller)
>
> /*
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -65,18 +65,61 @@ int ftrace_update_ftrace_func(ftrace_fun
> return ftrace_modify_code(pc, 0, new, false);
> }
>
> +#ifdef CONFIG_ARM64_MODULE_PLTS
> +static int install_ftrace_trampoline(struct module *mod, unsigned long *addr)
> +{
> + struct plt_entry trampoline, *mod_trampoline;
> + trampoline = get_plt_entry(*addr);
> +
> + if (*addr == FTRACE_ADDR)
> + mod_trampoline = mod->arch.ftrace_trampoline;
> + else if (*addr == FTRACE_REGS_ADDR)
> + mod_trampoline = mod->arch.ftrace_regs_trampoline;
Could we do something like
if (*addr == FTRACE_ADDR)
mod_trampoline = &mod->arch.ftrace_trampoline[0];
else if (*addr == FTRACE_REGS_ADDR)
mod_trampoline = &mod->arch.ftrace_trampoline[1];
and get rid of the additional struct field and pointer?
> + else
> + return -EINVAL;
> +
> + if (!plt_entries_equal(mod_trampoline, &trampoline)) {
> +
> + /* point the trampoline to our ftrace entry point */
> + module_disable_ro(mod);
> + *mod_trampoline = trampoline;
> + module_enable_ro(mod, true);
> +
> + /* update trampoline before patching in the branch */
> + smp_wmb();
> + }
> + *addr = (unsigned long)(void *)mod_trampoline;
> +
> + return 0;
> +}
> +#endif
> +
> +/*
> + * Ftrace with regs generates the tracer calls as close as possible to
> + * the function entry; no stack frame has been set up at that point.
> + * In order to make another call, e.g. to ftrace_caller, the LR must be
> + * saved from being overwritten.
> + * Between two functions, and with IPA-RA turned off, the scratch registers
> + * are available, so move the LR to x9 before calling into ftrace.
> + * "mov x9, lr" is officially aliased from "orr x9, xzr, lr".
> + */
> +#define QUICK_LR_SAVE aarch64_insn_gen_logical_shifted_reg( \
> + AARCH64_INSN_REG_9, AARCH64_INSN_REG_ZR, \
> + AARCH64_INSN_REG_LR, 0, AARCH64_INSN_VARIANT_64BIT, \
> + AARCH64_INSN_LOGIC_ORR)
> +
> /*
> * Turn on the call to ftrace_caller() in instrumented function
> */
> int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
> {
> - unsigned long pc = rec->ip;
> + unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
> + int ret;
> u32 old, new;
> long offset = (long)pc - (long)addr;
>
> if (offset < -SZ_128M || offset >= SZ_128M) {
> #ifdef CONFIG_ARM64_MODULE_PLTS
> - struct plt_entry trampoline;
> struct module *mod;
>
> /*
> @@ -96,54 +139,65 @@ int ftrace_make_call(struct dyn_ftrace *
> if (WARN_ON(!mod))
> return -EINVAL;
>
> - /*
> - * There is only one ftrace trampoline per module. For now,
> - * this is not a problem since on arm64, all dynamic ftrace
> - * invocations are routed via ftrace_caller(). This will need
> - * to be revisited if support for multiple ftrace entry points
> - * is added in the future, but for now, the pr_err() below
> - * deals with a theoretical issue only.
> - */
> - trampoline = get_plt_entry(addr);
> - if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> - &trampoline)) {
> - if (!plt_entries_equal(mod->arch.ftrace_trampoline,
> - &(struct plt_entry){})) {
> - pr_err("ftrace: far branches to multiple entry points unsupported inside a single module\n");
> - return -EINVAL;
> - }
> -
> - /* point the trampoline to our ftrace entry point */
> - module_disable_ro(mod);
> - *mod->arch.ftrace_trampoline = trampoline;
> - module_enable_ro(mod, true);
> -
> - /* update trampoline before patching in the branch */
> - smp_wmb();
> + /* Check against our well-known list of ftrace entry points */
> + if (addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR) {
> + ret = install_ftrace_trampoline(mod, &addr);
> + if (ret < 0)
> + return ret;
> }
> - addr = (unsigned long)(void *)mod->arch.ftrace_trampoline;
> + else
> + return -EINVAL;
> +
> #else /* CONFIG_ARM64_MODULE_PLTS */
> return -EINVAL;
> #endif /* CONFIG_ARM64_MODULE_PLTS */
> }
>
> old = aarch64_insn_gen_nop();
> + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
> + new = QUICK_LR_SAVE;
> + ret = ftrace_modify_code(pc - AARCH64_INSN_SIZE,
> + old, new, true);
> + if (ret)
> + return ret;
> + }
> new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
>
> return ftrace_modify_code(pc, old, new, true);
> }
>
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
> + unsigned long addr)
> +{
> + unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
> + u32 old, new;
> +
> + old = aarch64_insn_gen_branch_imm(pc, old_addr, true);
> + new = aarch64_insn_gen_branch_imm(pc, addr, true);
> +
> + return ftrace_modify_code(pc, old, new, true);
> +}
> +#endif
> +
> /*
> * Turn off the call to ftrace_caller() in instrumented function
> */
> int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
> unsigned long addr)
> {
> - unsigned long pc = rec->ip;
> + unsigned long pc = rec->ip + REC_IP_BRANCH_OFFSET;
> bool validate = true;
> + int ret;
> u32 old = 0, new;
> long offset = (long)pc - (long)addr;
>
> + /* -fpatchable-function-entry= does not generate a profiling call
> + * initially; the NOPs are already there.
> + */
> + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) && addr == MCOUNT_ADDR)
> + return 0;
> +
> if (offset < -SZ_128M || offset >= SZ_128M) {
> #ifdef CONFIG_ARM64_MODULE_PLTS
> u32 replaced;
> @@ -188,7 +242,15 @@ int ftrace_make_nop(struct module *mod,
>
> new = aarch64_insn_gen_nop();
>
> - return ftrace_modify_code(pc, old, new, validate);
> + ret = ftrace_modify_code(pc, old, new, validate);
> + if (ret)
> + return ret;
> + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS)) {
> + old = QUICK_LR_SAVE;
> + ret = ftrace_modify_code(pc - AARCH64_INSN_SIZE,
> + old, new, true);
> + }
> + return ret;
> }
>
> void arch_ftrace_update_code(int command)
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -115,6 +115,7 @@
> #define MCOUNT_REC() . = ALIGN(8); \
> __start_mcount_loc = .; \
> KEEP(*(__mcount_loc)) \
> + KEEP(*(__patchable_function_entries)) \
> __stop_mcount_loc = .;
> #else
> #define MCOUNT_REC()
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -61,8 +61,12 @@ extern void __chk_io_ptr(const volatile
> #if defined(CC_USING_HOTPATCH) && !defined(__CHECKER__)
> #define notrace __attribute__((hotpatch(0,0)))
> #else
> +#if defined(CONFIG_ARM64) && defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS)
> +#define notrace __attribute__((patchable_function_entry(0)))
> +#else
> #define notrace __attribute__((no_instrument_function))
> #endif
> +#endif
>
> /* Intel compiler defines __GNUC__. So we will overwrite implementations
> * coming from above header files here
> --- a/arch/arm64/include/asm/module.h
> +++ b/arch/arm64/include/asm/module.h
> @@ -33,6 +33,7 @@ struct mod_arch_specific {
>
> /* for CONFIG_DYNAMIC_FTRACE */
> struct plt_entry *ftrace_trampoline;
> + struct plt_entry *ftrace_regs_trampoline;
> };
> #endif
>
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -452,8 +452,11 @@ int module_finalize(const Elf_Ehdr *hdr,
> apply_alternatives_module((void *)s->sh_addr, s->sh_size);
> #ifdef CONFIG_ARM64_MODULE_PLTS
> if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE) &&
> - !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name))
> + !strcmp(".text.ftrace_trampoline", secstrs + s->sh_name)) {
> me->arch.ftrace_trampoline = (void *)s->sh_addr;
> + me->arch.ftrace_regs_trampoline =
> + (void *)(s->sh_addr + sizeof(struct plt_entry));
> + }
> #endif
> }
>
> --- a/arch/arm64/kernel/module-plts.c
> +++ b/arch/arm64/kernel/module-plts.c
> @@ -272,7 +272,7 @@ int module_frob_arch_sections(Elf_Ehdr *
> tramp->sh_type = SHT_NOBITS;
> tramp->sh_flags = SHF_EXECINSTR | SHF_ALLOC;
> tramp->sh_addralign = __alignof__(struct plt_entry);
> - tramp->sh_size = sizeof(struct plt_entry);
> + tramp->sh_size = 2 * sizeof(struct plt_entry);
> }
>
> return 0;
On 26 October 2018 at 16:21, Torsten Duwe <[email protected]> wrote:
> Based on ftrace with regs, do the usual thing.
> (see Documentation/livepatch/livepatch.txt)
>
> Use task flag bit 6 to track patch transition state for the consistency
> model. Add it to the work mask so it gets cleared on all kernel exits to
> userland.
>
> Tell livepatch regs->pc is the place to change the return address.
> Make sure the graph tracer call hook is only called on the final function
> entry in case regs->pc gets modified after an interception.
>
> Signed-off-by: Torsten Duwe <[email protected]>
>
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -120,6 +120,7 @@ config ARM64
> select HAVE_GENERIC_DMA_COHERENT
> select HAVE_HW_BREAKPOINT if PERF_EVENTS
> select HAVE_IRQ_TIME_ACCOUNTING
> + select HAVE_LIVEPATCH
> select HAVE_MEMBLOCK
> select HAVE_MEMBLOCK_NODE_MAP if NUMA
> select HAVE_NMI
> @@ -1350,4 +1351,6 @@ if CRYPTO
> source "arch/arm64/crypto/Kconfig"
> endif
>
> +source "kernel/livepatch/Kconfig"
> +
> source "lib/Kconfig"
> --- a/arch/arm64/include/asm/thread_info.h
> +++ b/arch/arm64/include/asm/thread_info.h
> @@ -76,6 +76,7 @@ void arch_release_task_struct(struct tas
> #define TIF_FOREIGN_FPSTATE 3 /* CPU's FP state is not current's */
> #define TIF_UPROBE 4 /* uprobe breakpoint or singlestep */
> #define TIF_FSCHECK 5 /* Check FS is USER_DS on return */
> +#define TIF_PATCH_PENDING 6
> #define TIF_NOHZ 7
> #define TIF_SYSCALL_TRACE 8
> #define TIF_SYSCALL_AUDIT 9
> @@ -94,6 +95,7 @@ void arch_release_task_struct(struct tas
> #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
> #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
> #define _TIF_FOREIGN_FPSTATE (1 << TIF_FOREIGN_FPSTATE)
> +#define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
> #define _TIF_NOHZ (1 << TIF_NOHZ)
> #define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
> #define _TIF_SYSCALL_AUDIT (1 << TIF_SYSCALL_AUDIT)
> @@ -106,7 +108,8 @@ void arch_release_task_struct(struct tas
>
> #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
> _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
> - _TIF_UPROBE | _TIF_FSCHECK)
> + _TIF_UPROBE | _TIF_FSCHECK | \
> + _TIF_PATCH_PENDING)
>
> #define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
> _TIF_SYSCALL_TRACEPOINT | _TIF_SECCOMP | \
> --- /dev/null
> +++ b/arch/arm64/include/asm/livepatch.h
> @@ -0,0 +1,35 @@
> +/* SPDX-License-Identifier: GPL-2.0
> + *
> + * livepatch.h - arm64-specific Kernel Live Patching Core
> + *
> + * Copyright (C) 2016,2018 SUSE
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef _ASM_ARM64_LIVEPATCH_H
> +#define _ASM_ARM64_LIVEPATCH_H
> +
> +#include <asm/ptrace.h>
> +
> +static inline int klp_check_compiler_support(void)
> +{
> + return 0;
> +}
> +
> +static inline void klp_arch_set_pc(struct pt_regs *regs, unsigned long ip)
> +{
> + regs->pc = ip;
> +}
> +
> +#endif /* _ASM_ARM64_LIVEPATCH_H */
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -226,6 +226,7 @@ ftrace_common:
>
> /* The program counter just after the ftrace call site */
> str lr, [x9, #S_PC]
> +
> /* The stack pointer as it was on ftrace_caller entry... */
> add x28, fp, #16
> str x28, [x9, #S_SP]
Please drop this hunk
> @@ -233,6 +234,10 @@ ftrace_common:
> ldr x28, [fp, 8]
> str x28, [x9, #S_LR] /* to pt_regs.r[30] */
>
> +#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
> + mov x28, lr /* remember old return address */
> +#endif
> +
> ldr_l x2, function_trace_op, x0
> ldr x1, [fp, #8]
> sub x0, lr, #8 /* function entry == IP */
> @@ -245,6 +250,17 @@ ftrace_call:
>
> bl ftrace_stub
>
> +#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
> + /* Is the trace function a live patcher and has messed with
> + * the return address?
> + */
> + add x9, sp, #16 /* advance to pt_regs for restore */
> + ldr x0, [x9, #S_PC]
> + cmp x0, x28 /* compare with the value we remembered */
> + /* to not call graph tracer's "call" mechanism twice! */
> + b.ne ftrace_common_return
Is ftrace_common_return guaranteed to be in range? Conditional
branches have only -/+ 1 MB range IIRC.
Better to do
b.eq ftrace_graph_call
b ftrace_common_return
to be sure
> +#endif
> +
> #ifdef CONFIG_FUNCTION_GRAPH_TRACER
Can we fold these #ifdef blocks together (i.e., incorporate the
conditional livepatch sequence here)?
> .global ftrace_graph_call
> ftrace_graph_call: // ftrace_graph_caller();
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -29,6 +29,7 @@
> #include <linux/sizes.h>
> #include <linux/string.h>
> #include <linux/tracehook.h>
> +#include <linux/livepatch.h>
> #include <linux/ratelimit.h>
> #include <linux/syscalls.h>
>
> @@ -934,6 +935,9 @@ asmlinkage void do_notify_resume(struct
> if (thread_flags & _TIF_UPROBE)
> uprobe_notify_resume(regs);
>
> + if (thread_flags & _TIF_PATCH_PENDING)
> + klp_update_patch_state(current);
> +
> if (thread_flags & _TIF_SIGPENDING)
> do_signal(regs);
>
On Thu, Nov 08, 2018 at 01:42:35PM +0100, Ard Biesheuvel wrote:
> On 26 October 2018 at 16:21, Torsten Duwe <[email protected]> wrote:
> > /* The program counter just after the ftrace call site */
> > str lr, [x9, #S_PC]
> > +
> > /* The stack pointer as it was on ftrace_caller entry... */
> > add x28, fp, #16
> > str x28, [x9, #S_SP]
>
> Please drop this hunk
Sure. I missed that one during cleanup.
> > @@ -233,6 +234,10 @@ ftrace_common:
^^^^^^^^^^^^^^
> > ldr x28, [fp, 8]
> > str x28, [x9, #S_LR] /* to pt_regs.r[30] */
> >
> > +#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
> > + mov x28, lr /* remember old return address */
> > +#endif
> > +
> > ldr_l x2, function_trace_op, x0
> > ldr x1, [fp, #8]
> > sub x0, lr, #8 /* function entry == IP */
> > @@ -245,6 +250,17 @@ ftrace_call:
> >
> > bl ftrace_stub
> >
> > +#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
> > + /* Is the trace function a live patcher and has messed with
> > + * the return address?
> > + */
> > + add x9, sp, #16 /* advance to pt_regs for restore */
> > + ldr x0, [x9, #S_PC]
> > + cmp x0, x28 /* compare with the value we remembered */
> > + /* to not call graph tracer's "call" mechanism twice! */
> > + b.ne ftrace_common_return
>
> Is ftrace_common_return guaranteed to be in range? Conditional
> branches have only -/+ 1 MB range IIRC.
It's the same function. A "1f" would do the same job, but the long label
is a talking identifier that saves a comment. I'd be more worried about
the return from the graph trace caller, which happens to be the _next_
function ;-)
If ftrace_caller or graph_caller grow larger than a meg, something else is
_very_ wrong.
> > +#endif
> > +
> > #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>
> Can we fold these #ifdef blocks together (i.e., incorporate the
> conditional livepatch sequence here)?
I'll see how to make it fit. But remember some people might want ftrace
but no live patching capability.
Thanks for the review!
Torsten
On Mon, 12 Nov 2018 at 12:01, Torsten Duwe <[email protected]> wrote:
>
> On Thu, Nov 08, 2018 at 01:42:35PM +0100, Ard Biesheuvel wrote:
> > On 26 October 2018 at 16:21, Torsten Duwe <[email protected]> wrote:
> > > /* The program counter just after the ftrace call site */
> > > str lr, [x9, #S_PC]
> > > +
> > > /* The stack pointer as it was on ftrace_caller entry... */
> > > add x28, fp, #16
> > > str x28, [x9, #S_SP]
> >
> > Please drop this hunk
>
> Sure. I missed that one during cleanup.
>
> > > @@ -233,6 +234,10 @@ ftrace_common:
> ^^^^^^^^^^^^^^
> > > ldr x28, [fp, 8]
> > > str x28, [x9, #S_LR] /* to pt_regs.r[30] */
> > >
> > > +#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
> > > + mov x28, lr /* remember old return address */
> > > +#endif
> > > +
> > > ldr_l x2, function_trace_op, x0
> > > ldr x1, [fp, #8]
> > > sub x0, lr, #8 /* function entry == IP */
> > > @@ -245,6 +250,17 @@ ftrace_call:
> > >
> > > bl ftrace_stub
> > >
> > > +#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
> > > + /* Is the trace function a live patcher and has messed with
> > > + * the return address?
> > > + */
> > > + add x9, sp, #16 /* advance to pt_regs for restore */
> > > + ldr x0, [x9, #S_PC]
> > > + cmp x0, x28 /* compare with the value we remembered */
> > > + /* to not call graph tracer's "call" mechanism twice! */
> > > + b.ne ftrace_common_return
> >
> > Is ftrace_common_return guaranteed to be in range? Conditional
> > branches have only -/+ 1 MB range IIRC.
>
> It's the same function. A "1f" would do the same job, but the long label
> is a talking identifier that saves a comment. I'd be more worried about
> the return from the graph trace caller, which happens to be the _next_
> function ;-)
>
> If ftrace_caller or graph_caller grow larger than a meg, something else is
> _very_ wrong.
>
Ah ok. I confused myself into thinking that ftrace_common_return() was
defined in another compilation unit.
> > > +#endif
> > > +
> > > #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> >
> > Can we fold these #ifdef blocks together (i.e., incorporate the
> > conditional livepatch sequence here)?
>
> I'll see how to make it fit. But remember some people might want ftrace
> but no live patching capability.
>
Sure. I simply mean turning this
#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_FUNCTION_GRAPH_TRACER)
<bla>
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
<bla bla>
#endif
into
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
#ifdef CONFIG_LIVEPATCH
<bla>
#endif
<bla bla>
#endif
On Thu, Nov 08, 2018 at 01:12:42PM +0100, Ard Biesheuvel wrote:
>
> On 26 October 2018 at 16:21, Torsten Duwe <[email protected]> wrote:
> > @@ -162,6 +165,114 @@ ftrace_graph_call: // ftrace_graph_cal
> >
> > mcount_exit
> > ENDPROC(ftrace_caller)
> > +#else /* CONFIG_DYNAMIC_FTRACE_WITH_REGS */
> > +
> > +/*
> > + * Since no -pg or similar compiler flag is used, there should really be
> > + * no reference to _mcount; so do not define one. Only some value for
> > + * MCOUNT_ADDR is needed for comparison. Let it point here to have some
> > + * sort of magic value that can be recognised when debugging.
> > + */
> > + .global _mcount
> > +_mcount:
> > + ret /* make it differ from regs caller */
> > +
> > +ENTRY(ftrace_regs_caller)
> > + /* callee's preliminary stack frame: */
> > + stp fp, x9, [sp, #-16]!
>
> Does the 'fp' alias for x29 work with older assemblers? I guess it
> does not matter for GCC 8+ code, but be careful when you rewrite
> existing stuff.
I had gotten the impression the fp alias has been there from the start, so I
used it for readability. Thanks for the heads-up; I'll double-check.
> > + mov fp, sp
> > +
> > + /* our stack frame: */
> > + stp fp, lr, [sp, #-S_FRAME_SIZE]!
>
> If sizeof(struct pt_regs) == S_FRAME_SIZE, you should subtract 16
> additional bytes here
This is intentional :-]
At the end of pt_regs there's a "stackframe", which now aligns with the
"preliminary" frame I create for the callee. Please tell me what the struct
member is good for if not for an actual callee stack frame...
I thought it was a neat idea.
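For illustration only (not part of the patch; it merely spells out the
assumptions that S_FRAME_SIZE equals sizeof(struct pt_regs) and that
pt_regs ends in the 16-byte "stackframe" member), this is the picture
after the two stp's, higher addresses on top:

  old fp | x9 (callee's saved lr)    <- preliminary callee frame,
                                        doubles as pt_regs.stackframe
  ...    rest of pt_regs      ...    <- pt_regs starts at sp + 16
  fp | lr of ftrace_regs_caller      <- sp, "our" frame

and the invariant, written out as C:

#include <linux/bug.h>
#include <linux/stddef.h>
#include <asm/ptrace.h>

/* untested sketch, only to document the layout assumption above */
static inline void ftrace_frame_layout_check(void)
{
	/* the embedded stack frame is the last 16 bytes of pt_regs */
	BUILD_BUG_ON(offsetof(struct pt_regs, stackframe) !=
		     sizeof(struct pt_regs) - 16);
}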
> > +
> > +ftrace_common:
> > + /*
> > + * At this point we have 2 new stack frames, and x9 pointing
> > + * at a pt_regs which we can populate as needed.
> > + */
> > +
> > + /* save function arguments */
> > + stp x0, x1, [x9]
> > + stp x2, x3, [x9, #S_X2]
> > + stp x4, x5, [x9, #S_X4]
> > + stp x6, x7, [x9, #S_X6]
> > + stp x8, x9, [x9, #S_X8]
> > +
>
> x9 is not a function argument, and if it were, you'd have clobbered it
> by now. Please use a single 'str' and store x8 only
This way the x9 slot in pt_regs will be undefined. Is that ok with everybody?
> > +ftrace_common_return:
> > + add x9, sp, #16 /* advance to pt_regs for restore */
> > +
> > + ldp x0, x1, [x9]
> > + ldp x2, x3, [x9, #S_X2]
> > + ldp x4, x5, [x9, #S_X4]
> > + ldp x6, x7, [x9, #S_X6]
> > + ldp x8, x9, [x9, #S_X8]
> > +
>
> Same as above. It also deserves a mention that you are relying on the
> absence of IPA-RA, and so x9..x18 are guaranteed to be dead at
> function entry, and so they don't need to be restored here.
Sure, I can quote some ABI spec here :-/
I just wish all arm code were so well documented.
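Something along these lines next to the restore block, then (my wording,
paraphrasing your point rather than quoting the AAPCS):

/*
 * Restoring x9..x18 is deliberately skipped: we rely on the absence
 * of IPA-RA, so per the procedure call standard these registers hold
 * no live values at function entry. Only the argument registers
 * x0..x8 need to be reloaded before returning to the instrumented
 * function.
 */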
> > --- a/arch/arm64/kernel/ftrace.c
> > +++ b/arch/arm64/kernel/ftrace.c
> > @@ -65,18 +65,61 @@ int ftrace_update_ftrace_func(ftrace_fun
> > return ftrace_modify_code(pc, 0, new, false);
> > }
> >
> > +#ifdef CONFIG_ARM64_MODULE_PLTS
> > +static int install_ftrace_trampoline(struct module *mod, unsigned long *addr)
> > +{
> > + struct plt_entry trampoline, *mod_trampoline;
> > + trampoline = get_plt_entry(*addr);
> > +
> > + if (*addr == FTRACE_ADDR)
> > + mod_trampoline = mod->arch.ftrace_trampoline;
> > + else if (*addr == FTRACE_REGS_ADDR)
> > + mod_trampoline = mod->arch.ftrace_regs_trampoline;
>
> Could we do something like
>
> if (*addr == FTRACE_ADDR)
> mod_trampoline = &mod->arch.ftrace_trampoline[0];
> else if (*addr == FTRACE_REGS_ADDR)
> mod_trampoline = &mod->arch.ftrace_trampoline[1];
>
> and get rid of the additional struct field and pointer?
"0" and "1" won't make it obvious which one has the regs tracing, but besides
that, I like the idea of making this a small array. Other opinions?
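Untested sketch of what I have in mind, with the index names invented
here so it stays obvious which entry does the regs tracing:

/* asm/module.h -- other members elided */
enum {
	FTRACE_PLT,		/* branches to ftrace_caller */
	FTRACE_REGS_PLT,	/* branches to ftrace_regs_caller */
	NR_FTRACE_PLTS
};

struct mod_arch_specific {
	/* ... core/init PLT sections unchanged ... */

	/* for CONFIG_DYNAMIC_FTRACE */
	struct plt_entry *ftrace_trampolines;	/* NR_FTRACE_PLTS entries */
};

/* ftrace.c -- the selection in install_ftrace_trampoline() becomes */
	if (*addr == FTRACE_ADDR)
		mod_trampoline = &mod->arch.ftrace_trampolines[FTRACE_PLT];
	else if (*addr == FTRACE_REGS_ADDR)
		mod_trampoline = &mod->arch.ftrace_trampolines[FTRACE_REGS_PLT];
	else
		return -EINVAL;

module-plts.c would then reserve NR_FTRACE_PLTS * sizeof(struct plt_entry)
for .text.ftrace_trampoline, and module_finalize() only needs to record
the section base once.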
Torsten