From: "David A. Long" <[email protected]>
This patchset is heavily based on Sandeepa Prabhu's ARM v8 kprobes patches,
first seen in October 2013. This version attempts to address concerns raised by
reviewers and also fixes problems discovered during testing.
This patchset adds support for kernel probes (kprobes), jump probes (jprobes),
and return probes (kretprobes) on ARM64.
The kprobes mechanism makes use of software breakpoint and single stepping
support available in the ARM v8 kernel.
Changes since v2 include:
1) Removal of NOP padding in kprobe XOL slots. Slots are now exactly one
instruction long.
2) Disabling of interrupts during execution in single-step mode.
3) Fixing of numerous problems in instruction simulation code (mostly
thanks to Will Cohen).
4) Support for the HAVE_REGS_AND_STACK_ACCESS_API feature is added, to allow
access to kprobes through debugfs.
5) kprobes is *not* enabled in defconfig.
6) Numerous complaints from checkpatch have been cleaned up, although a couple
remain as removing the function pointer typedefs results in ugly code.
Changes since v3 include:
1) Remove table-driven instruction parsing and replace with an if statement
calling out to old and new instruction test functions in insn.c.
2) I removed the addition of orig_x0 to ptrace.h.
3) Reorder the patches.
4) Replace the previous interrupt disabling (from Will Cohen) with
an improved solution (from Steve Capper).
Changes since v4 include:
1) Added insn.c functions to detect exception instructions and DAIF
read/write instructions, and use them to reject probing same.
2) Changed adr detect function to also recognize adrp. Reject both.
3) Added missing __kprobes for some new functions.
4) Added call to kprobes_fault_handler from mm do_page_fault.
5) Reject all non-simulated branch/ret instructions, not just those
that use an immediate offset.
6) Moved software breakpoint definitions into debug-monitors.h.
7) Removed "!XIP_KERNEL" from Kconfig.
8) Changed kprobes_condition_check_t and kprobes_prepare_t to probes_*,
for future sharing with uprobes.
9) Removed bogus call to kprobes_restore_local_irqflag() from
trampoline_probe_handler().
Changes since v5 include:
1) Replaced installation of breakpoint hook with direct call from the
handlers in debug-monitors.c, as requested.
2) Reject probing of instructions that read the interrupt mask, in
addition to instructions that set it.
3) Cleaned up comments describing usage of Debug Mask.
4) Added KPROBE_REENTER case in reenter_kprobe.
5) Corrected the ifdef'd definitions for notify_page_fault() to be
consistent when KPROBES is not configured.
6) Changed "cpsr" to "pstate" for HAVE_REGS_AND_STACK_ACCESS_API feature.
7) Added back in missing new files in previous patch.
8) Changed two instances of pr_warning() to pr_warn().
Note that there seems to be at least a potential issue with kprobes
on multiple (possibly all) platforms having to do with use of kfree
inside of the kretprobes trampoline handler. This has manifested
occasionally in systemtap testing on arm64. There does not appear to
be a simple solution to the problem.
Changes since v6 include:
1) New trampoline code from Will Cohen fixes the occasional failure seen
when processing kretprobes by replacing the software breakpoint with
assembly code to implement the return to the original execution stream.
2) Changed ip0, ip1, fp, and lr to plain numbered registers for purposes
of recognizing them as an ascii string in the stack/reg access code.
3) Removed orig_x0.
4) Moved ARM_x* defines from arch/arm64/include/uapi/asm/ptrace.h to
arch/arm64/kernel/ptrace.c.
Changes since v7 include:
1) Move trampoline entry/return code into separate ".S" file instead
of making it a macro in a header file.
2) Add missing register name definitions in asm-offsets.c and use them
in place of hard-coded integer offsets in the trampoline code.
3) Correct the values used to decode MSR immediate instructions, in insn.h.
4) Remove the currently unused simulate_none() function.
Changes since v8 include:
1) Replaced use of REG_OFFSET_NAME with GPR_OFFSET_NAME for numbered
registers.
2) Added an alias for "lr" in the register name lookup table, which perf
tools need to be able to recognize.
3) Changed the code for checking instruction types for probeability and
steppability as per review feedback.
4) Fixed the size of cache being flushed when filling single-step slot.
5) Fixed big-endian issues.
6) Blacklisted copy_to/from_user to avoid aborts while single-stepping.
7) Record conditional instructions that fail the conditional test just
like any other probed (non-conditional) instruction.
8) Removed use of magic number for detecting jprobe return and just
check the breakpoint address instead.
9) Got rid of the unnecessary arch/arm64/kprobes.h.
10) The PSTATE and SP are now properly saved in the kretprobe trampoline
code.
11) This patch no longer depends on the "Consolidate redundant
register/stack access code" patch set.
12) Remove call to fixup_exception from kprobe_fault_handler.
Changes since v9 include:
1) Remove arch/arm/kernel/opcodes.c from the arm64 build and move the renamed
arm64_check_condition() function to armv8_deprecated.c. Remove the
asmlinkage.
2) Various other type and style changes suggested by Marc Zyngier.
3) Put back the call to fixup_exception from kprobe_fault_handler.
It proved to be necessary for correct operation.
Changes since v10 include:
1) Rename arm64_check_condition() to arm32_check_condition().
2) Remove redundant define of ARM_OPCODE_CONDITION_UNCOND.
3) Use accessor functions to read and write registers by number
in the simulation code, to avoid accidentally overriding parts of
the pt_regs structure (e.g.: when the reg is xzr).
4) Remove unused register offset defines.
5) Replace instance of "(void *) 0" with NULL.
6) Rewrite the kretprobe trampoline code using arch/arm64/kvm/hyp/entry.S
as an example. Construct a more complete saved PSTATE in this code.
David A. Long (4):
arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
arm64: Add more test functions to insn.c
arm64: add copy_to/from_user to kprobes blacklist
arm64: add conditional instruction simulation support
Sandeepa Prabhu (4):
arm64: Kprobes with single stepping support
arm64: kprobes instruction simulation support
arm64: Add kernel return probes support (kretprobes)
kprobes: Add arm64 case in kprobe example module
William Cohen (1):
arm64: Add trampoline code for kretprobes
arch/arm64/Kconfig | 3 +
arch/arm64/include/asm/debug-monitors.h | 5 +
arch/arm64/include/asm/insn.h | 41 ++
arch/arm64/include/asm/kprobes.h | 62 ++++
arch/arm64/include/asm/probes.h | 45 +++
arch/arm64/include/asm/ptrace.h | 33 +-
arch/arm64/kernel/Makefile | 6 +-
arch/arm64/kernel/armv8_deprecated.c | 19 +-
arch/arm64/kernel/asm-offsets.c | 11 +
arch/arm64/kernel/debug-monitors.c | 18 +-
arch/arm64/kernel/insn.c | 129 +++++++
arch/arm64/kernel/kprobes-arm64.c | 150 ++++++++
arch/arm64/kernel/kprobes-arm64.h | 35 ++
arch/arm64/kernel/kprobes.c | 616 +++++++++++++++++++++++++++++++
arch/arm64/kernel/kprobes_trampoline.S | 88 +++++
arch/arm64/kernel/probes-simulate-insn.c | 218 +++++++++++
arch/arm64/kernel/probes-simulate-insn.h | 28 ++
arch/arm64/kernel/ptrace.c | 117 ++++++
arch/arm64/kernel/vmlinux.lds.S | 1 +
arch/arm64/lib/copy_from_user.S | 1 +
arch/arm64/lib/copy_to_user.S | 1 +
arch/arm64/mm/fault.c | 25 ++
samples/kprobes/kprobe_example.c | 8 +
23 files changed, 1652 insertions(+), 8 deletions(-)
create mode 100644 arch/arm64/include/asm/kprobes.h
create mode 100644 arch/arm64/include/asm/probes.h
create mode 100644 arch/arm64/kernel/kprobes-arm64.c
create mode 100644 arch/arm64/kernel/kprobes-arm64.h
create mode 100644 arch/arm64/kernel/kprobes.c
create mode 100644 arch/arm64/kernel/kprobes_trampoline.S
create mode 100644 arch/arm64/kernel/probes-simulate-insn.c
create mode 100644 arch/arm64/kernel/probes-simulate-insn.h
--
2.5.0
From: "David A. Long" <[email protected]>
Add HAVE_REGS_AND_STACK_ACCESS_API feature for arm64.
Signed-off-by: David A. Long <[email protected]>
---
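Note for reviewers, not for the commit log: the lookup this patch adds can be
tried out in user space. The following is a minimal sketch, assuming a
hypothetical struct demo_pt_regs that mirrors the arm64 register layout and an
abbreviated table; every demo_* name is an illustration, not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for the arm64 struct pt_regs layout. */
struct demo_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

struct demo_reg_offset {
	const char *name;
	int offset;
};

#define DEMO_GPR(r) { "x" #r, (int)offsetof(struct demo_pt_regs, regs[r]) }
#define DEMO_REG(r) { #r, (int)offsetof(struct demo_pt_regs, r) }
#define DEMO_MAX_REG_OFFSET offsetof(struct demo_pt_regs, pstate)

/* Abbreviated table: a few numbered registers, the "lr" alias, then the rest. */
static const struct demo_reg_offset demo_table[] = {
	DEMO_GPR(0), DEMO_GPR(1), DEMO_GPR(29), DEMO_GPR(30),
	{ "lr", (int)offsetof(struct demo_pt_regs, regs[30]) },
	DEMO_REG(sp), DEMO_REG(pc), DEMO_REG(pstate),
	{ NULL, 0 },
};

/* Map a register name to its byte offset, or -EINVAL if the name is unknown. */
int demo_query_register_offset(const char *name)
{
	const struct demo_reg_offset *roff;

	for (roff = demo_table; roff->name != NULL; roff++)
		if (!strcmp(roff->name, name))
			return roff->offset;
	return -EINVAL;
}

/* Read the 64-bit value stored at @offset; out-of-range offsets yield 0. */
uint64_t demo_get_register(const struct demo_pt_regs *regs, unsigned int offset)
{
	uint64_t value;

	if (offset > DEMO_MAX_REG_OFFSET)
		return 0;
	memcpy(&value, (const char *)regs + offset, sizeof(value));
	return value;
}

/* Usage example: resolve a name, then read that register from a snapshot. */
void demo_regs_selfcheck(void)
{
	struct demo_pt_regs snap = { .sp = 0x8000 };
	int off = demo_query_register_offset("sp");

	assert(off >= 0);
	assert(demo_get_register(&snap, (unsigned int)off) == 0x8000);
}
```

In this sketch "lr" and "x30" resolve to the same offset, which is the alias
the cover letter says perf tools need.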
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/ptrace.h | 31 +++++++++++
arch/arm64/kernel/ptrace.c | 117 ++++++++++++++++++++++++++++++++++++++++
3 files changed, 149 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc6228..4211b0d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -78,6 +78,7 @@ config ARM64
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
+ select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RCU_TABLE_FREE
select HAVE_SYSCALL_TRACEPOINTS
select IOMMU_DMA if IOMMU_SUPPORT
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index e9e5467..7bd6445 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -118,6 +118,8 @@ struct pt_regs {
u64 syscallno;
};
+#define MAX_REG_OFFSET offsetof(struct user_pt_regs, pstate)
+
#define arch_has_single_step() (1)
#ifdef CONFIG_COMPAT
@@ -146,6 +148,35 @@ struct pt_regs {
#define user_stack_pointer(regs) \
(!compat_user_mode(regs) ? (regs)->sp : (regs)->compat_sp)
+extern int regs_query_register_offset(const char *name);
+extern const char *regs_query_register_name(unsigned int offset);
+extern bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr);
+extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
+ unsigned int n);
+
+/**
+ * regs_get_register() - get register value from its offset
+ * @regs: pt_regs from which register value is gotten
+ * @offset: offset number of the register.
+ *
+ * regs_get_register() returns the value of the register located at @offset
+ * bytes from the start of @regs, i.e. its offset within struct pt_regs.
+ * If @offset is bigger than MAX_REG_OFFSET, this returns 0.
+ */
+static inline u64 regs_get_register(struct pt_regs *regs,
+ unsigned int offset)
+{
+ if (unlikely(offset > MAX_REG_OFFSET))
+ return 0;
+ return *(u64 *)((u64)regs + offset);
+}
+
+/* Valid only for Kernel mode traps. */
+static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
+{
+ return regs->sp;
+}
+
static inline unsigned long regs_return_value(struct pt_regs *regs)
{
return regs->regs[0];
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index ff7f132..efebf0f 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -48,6 +48,123 @@
#define CREATE_TRACE_POINTS
#include <trace/events/syscalls.h>
+struct pt_regs_offset {
+ const char *name;
+ int offset;
+};
+
+#define REG_OFFSET_NAME(r) {.name = #r, .offset = offsetof(struct pt_regs, r)}
+#define REG_OFFSET_END {.name = NULL, .offset = 0}
+#define GPR_OFFSET_NAME(r) \
+ {.name = "x" #r, .offset = offsetof(struct pt_regs, regs[r])}
+
+static const struct pt_regs_offset regoffset_table[] = {
+ GPR_OFFSET_NAME(0),
+ GPR_OFFSET_NAME(1),
+ GPR_OFFSET_NAME(2),
+ GPR_OFFSET_NAME(3),
+ GPR_OFFSET_NAME(4),
+ GPR_OFFSET_NAME(5),
+ GPR_OFFSET_NAME(6),
+ GPR_OFFSET_NAME(7),
+ GPR_OFFSET_NAME(8),
+ GPR_OFFSET_NAME(9),
+ GPR_OFFSET_NAME(10),
+ GPR_OFFSET_NAME(11),
+ GPR_OFFSET_NAME(12),
+ GPR_OFFSET_NAME(13),
+ GPR_OFFSET_NAME(14),
+ GPR_OFFSET_NAME(15),
+ GPR_OFFSET_NAME(16),
+ GPR_OFFSET_NAME(17),
+ GPR_OFFSET_NAME(18),
+ GPR_OFFSET_NAME(19),
+ GPR_OFFSET_NAME(20),
+ GPR_OFFSET_NAME(21),
+ GPR_OFFSET_NAME(22),
+ GPR_OFFSET_NAME(23),
+ GPR_OFFSET_NAME(24),
+ GPR_OFFSET_NAME(25),
+ GPR_OFFSET_NAME(26),
+ GPR_OFFSET_NAME(27),
+ GPR_OFFSET_NAME(28),
+ GPR_OFFSET_NAME(29),
+ GPR_OFFSET_NAME(30),
+ {.name = "lr", .offset = offsetof(struct pt_regs, regs[30])},
+ REG_OFFSET_NAME(sp),
+ REG_OFFSET_NAME(pc),
+ REG_OFFSET_NAME(pstate),
+ REG_OFFSET_END,
+};
+
+/**
+ * regs_query_register_offset() - query register offset from its name
+ * @name: the name of a register
+ *
+ * regs_query_register_offset() returns the offset of a register in struct
+ * pt_regs from its name. If the name is invalid, this returns -EINVAL;
+ */
+int regs_query_register_offset(const char *name)
+{
+ const struct pt_regs_offset *roff;
+
+ for (roff = regoffset_table; roff->name != NULL; roff++)
+ if (!strcmp(roff->name, name))
+ return roff->offset;
+ return -EINVAL;
+}
+
+/**
+ * regs_query_register_name() - query register name from its offset
+ * @offset: the offset of a register in struct pt_regs.
+ *
+ * regs_query_register_name() returns the name of a register from its
+ * offset in struct pt_regs. If the @offset is invalid, this returns NULL;
+ */
+const char *regs_query_register_name(unsigned int offset)
+{
+ const struct pt_regs_offset *roff;
+
+ for (roff = regoffset_table; roff->name != NULL; roff++)
+ if (roff->offset == offset)
+ return roff->name;
+ return NULL;
+}
+
+/**
+ * regs_within_kernel_stack() - check the address in the stack
+ * @regs: pt_regs which contains kernel stack pointer.
+ * @addr: address which is checked.
+ *
+ * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
+ * If @addr is within the kernel stack, it returns true. If not, returns false.
+ */
+bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
+{
+ return ((addr & ~(THREAD_SIZE - 1)) ==
+ (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
+}
+
+/**
+ * regs_get_kernel_stack_nth() - get Nth entry of the stack
+ * @regs: pt_regs which contains kernel stack pointer.
+ * @n: stack entry number.
+ *
+ * regs_get_kernel_stack_nth() returns @n th entry of the kernel stack which
+ * is specified by @regs. If the @n th entry is NOT in the kernel stack,
+ * this returns 0.
+ */
+unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n)
+{
+ unsigned long *addr = (unsigned long *)kernel_stack_pointer(regs);
+
+ addr += n;
+ if (regs_within_kernel_stack(regs, (unsigned long)addr))
+ return *addr;
+ else
+ return 0;
+}
+
/*
* TODO: does not yet catch signals sent when the child dies.
* in exit.c or in signal.c.
--
2.5.0
From: "David A. Long" <[email protected]>
Certain instructions are hard to execute correctly out-of-line (as in
kprobes). Test functions are added to insn.[hc] to identify these. The
instructions include any that use PC-relative addressing, change the PC,
or change interrupt masking. For efficiency and simplicity, test
functions are also added for small collections of related instructions.
Signed-off-by: David A. Long <[email protected]>
---
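Note for reviewers, not for the commit log: a minimal user-space sketch of the
mask/value matching these test functions rely on. The mask and value pairs and
the 0x1FFFE0 extraction are copied from this patch; the demo_* names and the
DAIF helper are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Generate a mask/value predicate, mirroring the __AARCH64_INSN_FUNCS idea. */
#define DEMO_INSN_FUNCS(abbr, mask, val)		\
bool demo_insn_is_##abbr(uint32_t code)			\
{ return (code & (mask)) == (val); }

DEMO_INSN_FUNCS(adr_adrp,  0x1F000000, 0x10000000)
DEMO_INSN_FUNCS(prfm_lit,  0xFF000000, 0xD8000000)
DEMO_INSN_FUNCS(ldr_lit,   0xBF000000, 0x18000000)
DEMO_INSN_FUNCS(ldrsw_lit, 0xFF000000, 0x98000000)
DEMO_INSN_FUNCS(mrs,       0xFFF00000, 0xD5300000)

/* True for instructions that compute an address relative to the PC. */
bool demo_insn_uses_literal(uint32_t insn)
{
	return demo_insn_is_ldr_lit(insn) ||
	       demo_insn_is_ldrsw_lit(insn) ||
	       demo_insn_is_adr_adrp(insn) ||
	       demo_insn_is_prfm_lit(insn);
}

/* Extract the op0/op1/CRn/CRm/op2 field (bits 20:5) of an msr/mrs. */
uint32_t demo_extract_system_reg(uint32_t insn)
{
	return (insn & 0x1FFFE0) >> 5;
}

#define DEMO_SPCLREG_DAIF 0xDA11

/* Hypothetical helper: does this instruction read the interrupt mask? */
bool demo_insn_reads_daif(uint32_t insn)
{
	return demo_insn_is_mrs(insn) &&
	       demo_extract_system_reg(insn) == DEMO_SPCLREG_DAIF;
}

/* Usage example with two well-known encodings. */
void demo_insn_selfcheck(void)
{
	assert(demo_insn_uses_literal(0x90000000));	/* adrp x0, <page> */
	assert(demo_insn_reads_daif(0xD53B4220));	/* mrs x0, daif */
}
```

The encodings in the usage example were worked out by hand from the A64
encoding tables, so treat them as illustrative inputs rather than a test suite.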
arch/arm64/include/asm/insn.h | 35 +++++++++++++++++++++++++++++++++++
arch/arm64/kernel/insn.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 69 insertions(+)
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 30e50eb..662b42a 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -120,6 +120,29 @@ enum aarch64_insn_register {
AARCH64_INSN_REG_SP = 31 /* Stack pointer: as load/store base reg */
};
+enum aarch64_insn_special_register {
+ AARCH64_INSN_SPCLREG_SPSR_EL1 = 0xC200,
+ AARCH64_INSN_SPCLREG_ELR_EL1 = 0xC201,
+ AARCH64_INSN_SPCLREG_SP_EL0 = 0xC208,
+ AARCH64_INSN_SPCLREG_SPSEL = 0xC210,
+ AARCH64_INSN_SPCLREG_CURRENTEL = 0xC212,
+ AARCH64_INSN_SPCLREG_DAIF = 0xDA11,
+ AARCH64_INSN_SPCLREG_NZCV = 0xDA10,
+ AARCH64_INSN_SPCLREG_FPCR = 0xDA20,
+ AARCH64_INSN_SPCLREG_DSPSR_EL0 = 0xDA28,
+ AARCH64_INSN_SPCLREG_DLR_EL0 = 0xDA29,
+ AARCH64_INSN_SPCLREG_SPSR_EL2 = 0xE200,
+ AARCH64_INSN_SPCLREG_ELR_EL2 = 0xE201,
+ AARCH64_INSN_SPCLREG_SP_EL1 = 0xE208,
+ AARCH64_INSN_SPCLREG_SPSR_IRQ = 0xE218,
+ AARCH64_INSN_SPCLREG_SPSR_ABT = 0xE219,
+ AARCH64_INSN_SPCLREG_SPSR_UND = 0xE21A,
+ AARCH64_INSN_SPCLREG_SPSR_FIQ = 0xE21B,
+ AARCH64_INSN_SPCLREG_SPSR_EL3 = 0xF200,
+ AARCH64_INSN_SPCLREG_ELR_EL3 = 0xF201,
+ AARCH64_INSN_SPCLREG_SP_EL2 = 0xF210
+};
+
enum aarch64_insn_variant {
AARCH64_INSN_VARIANT_32BIT,
AARCH64_INSN_VARIANT_64BIT
@@ -223,8 +246,13 @@ static __always_inline bool aarch64_insn_is_##abbr(u32 code) \
static __always_inline u32 aarch64_insn_get_##abbr##_value(void) \
{ return (val); }
+__AARCH64_INSN_FUNCS(adr_adrp, 0x1F000000, 0x10000000)
+__AARCH64_INSN_FUNCS(prfm_lit, 0xFF000000, 0xD8000000)
__AARCH64_INSN_FUNCS(str_reg, 0x3FE0EC00, 0x38206800)
__AARCH64_INSN_FUNCS(ldr_reg, 0x3FE0EC00, 0x38606800)
+__AARCH64_INSN_FUNCS(ldr_lit, 0xBF000000, 0x18000000)
+__AARCH64_INSN_FUNCS(ldrsw_lit, 0xFF000000, 0x98000000)
+__AARCH64_INSN_FUNCS(exclusive, 0x3F800000, 0x08000000)
__AARCH64_INSN_FUNCS(stp_post, 0x7FC00000, 0x28800000)
__AARCH64_INSN_FUNCS(ldp_post, 0x7FC00000, 0x28C00000)
__AARCH64_INSN_FUNCS(stp_pre, 0x7FC00000, 0x29800000)
@@ -273,10 +301,14 @@ __AARCH64_INSN_FUNCS(svc, 0xFFE0001F, 0xD4000001)
__AARCH64_INSN_FUNCS(hvc, 0xFFE0001F, 0xD4000002)
__AARCH64_INSN_FUNCS(smc, 0xFFE0001F, 0xD4000003)
__AARCH64_INSN_FUNCS(brk, 0xFFE0001F, 0xD4200000)
+__AARCH64_INSN_FUNCS(exception, 0xFF000000, 0xD4000000)
__AARCH64_INSN_FUNCS(hint, 0xFFFFF01F, 0xD503201F)
__AARCH64_INSN_FUNCS(br, 0xFFFFFC1F, 0xD61F0000)
__AARCH64_INSN_FUNCS(blr, 0xFFFFFC1F, 0xD63F0000)
__AARCH64_INSN_FUNCS(ret, 0xFFFFFC1F, 0xD65F0000)
+__AARCH64_INSN_FUNCS(mrs, 0xFFF00000, 0xD5300000)
+__AARCH64_INSN_FUNCS(msr_imm, 0xFFF8F01F, 0xD500401F)
+__AARCH64_INSN_FUNCS(msr_reg, 0xFFF00000, 0xD5100000)
#undef __AARCH64_INSN_FUNCS
@@ -286,6 +318,8 @@ bool aarch64_insn_is_branch_imm(u32 insn);
int aarch64_insn_read(void *addr, u32 *insnp);
int aarch64_insn_write(void *addr, u32 insn);
enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
+bool aarch64_insn_uses_literal(u32 insn);
+bool aarch64_insn_is_branch(u32 insn);
u64 aarch64_insn_decode_immediate(enum aarch64_insn_imm_type type, u32 insn);
u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
u32 insn, u64 imm);
@@ -367,6 +401,7 @@ bool aarch32_insn_is_wide(u32 insn);
#define A32_RT_OFFSET 12
#define A32_RT2_OFFSET 0
+u32 aarch64_insn_extract_system_reg(u32 insn);
u32 aarch32_insn_extract_reg_num(u32 insn, int offset);
u32 aarch32_insn_mcr_extract_opc2(u32 insn);
u32 aarch32_insn_mcr_extract_crm(u32 insn);
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 7371455..60c1c71 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -162,6 +162,32 @@ static bool __kprobes __aarch64_insn_hotpatch_safe(u32 insn)
aarch64_insn_is_nop(insn);
}
+bool __kprobes aarch64_insn_uses_literal(u32 insn)
+{
+ /* ldr/ldrsw (literal), adr/adrp, prfm (literal) */
+
+ return aarch64_insn_is_ldr_lit(insn) ||
+ aarch64_insn_is_ldrsw_lit(insn) ||
+ aarch64_insn_is_adr_adrp(insn) ||
+ aarch64_insn_is_prfm_lit(insn);
+}
+
+bool __kprobes aarch64_insn_is_branch(u32 insn)
+{
+ /* b, bl, cb*, tb*, b.cond, br, blr, ret */
+
+ return aarch64_insn_is_b(insn) ||
+ aarch64_insn_is_bl(insn) ||
+ aarch64_insn_is_cbz(insn) ||
+ aarch64_insn_is_cbnz(insn) ||
+ aarch64_insn_is_tbz(insn) ||
+ aarch64_insn_is_tbnz(insn) ||
+ aarch64_insn_is_ret(insn) ||
+ aarch64_insn_is_br(insn) ||
+ aarch64_insn_is_blr(insn) ||
+ aarch64_insn_is_bcond(insn);
+}
+
/*
* ARM Architecture Reference Manual for ARMv8 Profile-A, Issue A.a
* Section B2.6.5 "Concurrent modification and execution of instructions":
@@ -1175,6 +1201,14 @@ u32 aarch64_set_branch_offset(u32 insn, s32 offset)
BUG();
}
+/*
+ * Extract the Op/CR data from a msr/mrs instruction.
+ */
+u32 aarch64_insn_extract_system_reg(u32 insn)
+{
+ return (insn & 0x1FFFE0) >> 5;
+}
+
bool aarch32_insn_is_wide(u32 insn)
{
return insn >= 0xe800;
--
2.5.0
From: "David A. Long" <[email protected]>
Currently, taking an exception while accessing user data from a kprobe'd
instruction does not work. Avoid this situation by blacklisting the relevant
functions.
Signed-off-by: David A. Long <[email protected]>
---
arch/arm64/lib/copy_from_user.S | 1 +
arch/arm64/lib/copy_to_user.S | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
index 4699cd7..0ac2131 100644
--- a/arch/arm64/lib/copy_from_user.S
+++ b/arch/arm64/lib/copy_from_user.S
@@ -66,6 +66,7 @@
.endm
end .req x5
+ .section .kprobes.text,"ax",%progbits
ENTRY(__copy_from_user)
ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN)
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 7512bbb..e4eb84c 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -65,6 +65,7 @@
.endm
end .req x5
+ .section .kprobes.text,"ax",%progbits
ENTRY(__copy_to_user)
ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN)
--
2.5.0
From: "David A. Long" <[email protected]>
Cease using the arm32 arm_check_condition() function and replace it with
a local version for use in deprecated instruction support on arm64. Also
make the function table used by this available for future use by kprobes
and/or uprobes.
This function is derived from code written by Sandeepa Prabhu.
Signed-off-by: Sandeepa Prabhu <[email protected]>
Signed-off-by: David A. Long <[email protected]>
---
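Note for reviewers, not for the commit log: a minimal user-space sketch for
cross-checking the condition table. It deliberately uses a different
formulation from the patch, namely the architectural rule of switching on
condition bits 3:1 and inverting on bit 0, so that the two can be compared.
The flag positions (N=31, Z=30, C=29, V=28) match PSTATE; the demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PSR_N (UINT64_C(1) << 31)
#define DEMO_PSR_Z (UINT64_C(1) << 30)
#define DEMO_PSR_C (UINT64_C(1) << 29)
#define DEMO_PSR_V (UINT64_C(1) << 28)

/* Evaluate a 4-bit condition code against the NZCV flags in @pstate. */
bool demo_condition_holds(unsigned int cc, uint64_t pstate)
{
	bool n = (pstate & DEMO_PSR_N) != 0;
	bool z = (pstate & DEMO_PSR_Z) != 0;
	bool c = (pstate & DEMO_PSR_C) != 0;
	bool v = (pstate & DEMO_PSR_V) != 0;
	bool result;

	cc &= 0xf;
	switch (cc >> 1) {
	case 0:  result = z; break;			/* eq / ne */
	case 1:  result = c; break;			/* cs / cc */
	case 2:  result = n; break;			/* mi / pl */
	case 3:  result = v; break;			/* vs / vc */
	case 4:  result = c && !z; break;		/* hi / ls */
	case 5:  result = (n == v); break;		/* ge / lt */
	case 6:  result = (n == v) && !z; break;	/* gt / le */
	default: result = true; break;			/* al */
	}

	/* Odd codes invert the test, except 0xf, which also means "always". */
	if ((cc & 1) && cc != 0xf)
		result = !result;
	return result;
}

/* Only the top four bits of an A32 opcode select the condition. */
bool demo_opcode_condition_passes(uint32_t opcode, uint64_t pstate)
{
	return demo_condition_holds(opcode >> 28, pstate);
}

/* Usage example: "hi" needs C set and Z clear. */
void demo_cond_selfcheck(void)
{
	assert(demo_condition_holds(0x8, DEMO_PSR_C));
	assert(!demo_condition_holds(0x8, DEMO_PSR_C | DEMO_PSR_Z));
}
```

Entries 14 and 15 both evaluate to true here, matching the two __check_al
slots at the end of opcode_condition_checks[].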
arch/arm64/include/asm/insn.h | 3 ++
arch/arm64/kernel/Makefile | 3 +-
arch/arm64/kernel/armv8_deprecated.c | 19 +++++++-
arch/arm64/kernel/insn.c | 94 ++++++++++++++++++++++++++++++++++++
4 files changed, 115 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 662b42a..72dda48 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -405,6 +405,9 @@ u32 aarch64_extract_system_register(u32 insn);
u32 aarch32_insn_extract_reg_num(u32 insn, int offset);
u32 aarch32_insn_mcr_extract_opc2(u32 insn);
u32 aarch32_insn_mcr_extract_crm(u32 insn);
+
+typedef bool (pstate_check_t)(unsigned long);
+extern pstate_check_t * const opcode_condition_checks[16];
#endif /* __ASSEMBLY__ */
#endif /* __ASM_INSN_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e6..fd5f163 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -26,8 +26,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
$(call if_changed,objcopy)
arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
- sys_compat.o entry32.o \
- ../../arm/kernel/opcodes.o
+ sys_compat.o entry32.o
arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 3e01207..c655259 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -369,6 +369,21 @@ static int emulate_swpX(unsigned int address, unsigned int *data,
return res;
}
+#define ARM_OPCODE_CONDITION_UNCOND 0xf
+
+static unsigned int __kprobes arm32_check_condition(u32 opcode, u32 psr)
+{
+ u32 cc_bits = opcode >> 28;
+
+ if (cc_bits != ARM_OPCODE_CONDITION_UNCOND) {
+ if ((*opcode_condition_checks[cc_bits])(psr))
+ return ARM_OPCODE_CONDTEST_PASS;
+ else
+ return ARM_OPCODE_CONDTEST_FAIL;
+ }
+ return ARM_OPCODE_CONDTEST_UNCOND;
+}
+
/*
* swp_handler logs the id of calling process, dissects the instruction, sanity
* checks the memory location, calls emulate_swpX for the actual operation and
@@ -383,7 +398,7 @@ static int swp_handler(struct pt_regs *regs, u32 instr)
type = instr & TYPE_SWPB;
- switch (arm_check_condition(instr, regs->pstate)) {
+ switch (arm32_check_condition(instr, regs->pstate)) {
case ARM_OPCODE_CONDTEST_PASS:
break;
case ARM_OPCODE_CONDTEST_FAIL:
@@ -464,7 +479,7 @@ static int cp15barrier_handler(struct pt_regs *regs, u32 instr)
{
perf_sw_event(PERF_COUNT_SW_EMULATION_FAULTS, 1, regs, regs->pc);
- switch (arm_check_condition(instr, regs->pstate)) {
+ switch (arm32_check_condition(instr, regs->pstate)) {
case ARM_OPCODE_CONDTEST_PASS:
break;
case ARM_OPCODE_CONDTEST_FAIL:
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 60c1c71..9f15ceb 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1234,3 +1234,97 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn)
{
return insn & CRM_MASK;
}
+
+static bool __kprobes __check_eq(unsigned long pstate)
+{
+ return (pstate & PSR_Z_BIT) != 0;
+}
+
+static bool __kprobes __check_ne(unsigned long pstate)
+{
+ return (pstate & PSR_Z_BIT) == 0;
+}
+
+static bool __kprobes __check_cs(unsigned long pstate)
+{
+ return (pstate & PSR_C_BIT) != 0;
+}
+
+static bool __kprobes __check_cc(unsigned long pstate)
+{
+ return (pstate & PSR_C_BIT) == 0;
+}
+
+static bool __kprobes __check_mi(unsigned long pstate)
+{
+ return (pstate & PSR_N_BIT) != 0;
+}
+
+static bool __kprobes __check_pl(unsigned long pstate)
+{
+ return (pstate & PSR_N_BIT) == 0;
+}
+
+static bool __kprobes __check_vs(unsigned long pstate)
+{
+ return (pstate & PSR_V_BIT) != 0;
+}
+
+static bool __kprobes __check_vc(unsigned long pstate)
+{
+ return (pstate & PSR_V_BIT) == 0;
+}
+
+static bool __kprobes __check_hi(unsigned long pstate)
+{
+ pstate &= ~(pstate >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */
+ return (pstate & PSR_C_BIT) != 0;
+}
+
+static bool __kprobes __check_ls(unsigned long pstate)
+{
+ pstate &= ~(pstate >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */
+ return (pstate & PSR_C_BIT) == 0;
+}
+
+static bool __kprobes __check_ge(unsigned long pstate)
+{
+ pstate ^= (pstate << 3); /* PSR_N_BIT ^= PSR_V_BIT */
+ return (pstate & PSR_N_BIT) == 0;
+}
+
+static bool __kprobes __check_lt(unsigned long pstate)
+{
+ pstate ^= (pstate << 3); /* PSR_N_BIT ^= PSR_V_BIT */
+ return (pstate & PSR_N_BIT) != 0;
+}
+
+static bool __kprobes __check_gt(unsigned long pstate)
+{
+ /*PSR_N_BIT ^= PSR_V_BIT */
+ unsigned long temp = pstate ^ (pstate << 3);
+
+ temp |= (pstate << 1); /*PSR_N_BIT |= PSR_Z_BIT */
+ return (temp & PSR_N_BIT) == 0;
+}
+
+static bool __kprobes __check_le(unsigned long pstate)
+{
+ /*PSR_N_BIT ^= PSR_V_BIT */
+ unsigned long temp = pstate ^ (pstate << 3);
+
+ temp |= (pstate << 1); /*PSR_N_BIT |= PSR_Z_BIT */
+ return (temp & PSR_N_BIT) != 0;
+}
+
+static bool __kprobes __check_al(unsigned long pstate)
+{
+ return true;
+}
+
+pstate_check_t * const opcode_condition_checks[16] = {
+ __check_eq, __check_ne, __check_cs, __check_cc,
+ __check_mi, __check_pl, __check_vs, __check_vc,
+ __check_hi, __check_ls, __check_ge, __check_lt,
+ __check_gt, __check_le, __check_al, __check_al
+};
--
2.5.0
From: Sandeepa Prabhu <[email protected]>
Kprobes needs to simulate instructions that cannot be single-stepped
from a different memory location, e.g. those that use PC-relative
addressing. In simulation, the behaviour of the instruction is
implemented using a copy of pt_regs.
The following instruction categories are simulated:
- All branching instructions (conditional, register, and immediate)
- Literal access instructions (load-literal, adr/adrp)
Conditional execution is limited to branching instructions in
ARM v8. If the condition flags in PSTATE do not satisfy the condition
field of the opcode, the instruction is effectively a NOP.
Thanks to Will Cohen for assorted suggested changes.
Signed-off-by: Sandeepa Prabhu <[email protected]>
Signed-off-by: William Cohen <[email protected]>
Signed-off-by: David A. Long <[email protected]>
---
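Note for reviewers, not for the commit log: a minimal user-space sketch of what
simulating a PC-relative instruction amounts to. It assumes a hypothetical
struct demo_regs, covers only b/bl and adr (not adrp), and takes the probe
address as the base for the offset. The demo_* functions are illustrations and
are not the handlers added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register snapshot: x0..x30 plus the program counter. */
struct demo_regs {
	uint64_t regs[31];
	uint64_t pc;
};

/* Sign-extend @value, whose sign bit sits at position @signbit. */
int64_t demo_sign_extend(uint64_t value, unsigned int signbit)
{
	uint64_t m = UINT64_C(1) << signbit;

	assert(signbit < 63);
	value &= (m << 1) - 1;		/* keep bits signbit..0 only */
	return (int64_t)((value ^ m) - m);
}

/* b / bl: imm26 in bits 25:0, scaled by 4; bit 31 set means bl. */
void demo_simulate_b_bl(uint32_t opcode, uint64_t addr, struct demo_regs *regs)
{
	int64_t disp = demo_sign_extend((uint64_t)(opcode & 0x3FFFFFF) << 2, 27);

	if (opcode & (UINT32_C(1) << 31))
		regs->regs[30] = addr + 4;	/* link register = return address */
	regs->pc = addr + (uint64_t)disp;
}

/* adr: imm21 = immhi (bits 23:5) : immlo (bits 30:29); Rd = addr + imm. */
void demo_simulate_adr(uint32_t opcode, uint64_t addr, struct demo_regs *regs)
{
	uint32_t rd = opcode & 0x1f;
	uint64_t imm = (((opcode >> 5) & 0x7FFFF) << 2) | ((opcode >> 29) & 0x3);
	int64_t disp = demo_sign_extend(imm, 20);

	if (rd != 31)	/* Rd == 31 names xzr here, so the write is dropped */
		regs->regs[rd] = addr + (uint64_t)disp;
	regs->pc = addr + 4;	/* not a branch: continue with the next insn */
}

/* Usage example: "b .+8" probed at 0x1000 continues at 0x1008. */
void demo_simulate_selfcheck(void)
{
	struct demo_regs r = { .pc = 0x1000 };

	demo_simulate_b_bl(0x14000002, 0x1000, &r);
	assert(r.pc == 0x1008);
}
```

Because the offset is applied to the original probe address, nothing has to be
copied to an out-of-line slot, which is why such instructions can be probed
without one.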
arch/arm64/include/asm/insn.h | 1 +
arch/arm64/include/asm/probes.h | 5 +-
arch/arm64/kernel/Makefile | 3 +-
arch/arm64/kernel/insn.c | 1 +
arch/arm64/kernel/kprobes-arm64.c | 29 ++++
arch/arm64/kernel/kprobes.c | 32 ++++-
arch/arm64/kernel/probes-simulate-insn.c | 218 +++++++++++++++++++++++++++++++
arch/arm64/kernel/probes-simulate-insn.h | 28 ++++
8 files changed, 311 insertions(+), 6 deletions(-)
create mode 100644 arch/arm64/kernel/probes-simulate-insn.c
create mode 100644 arch/arm64/kernel/probes-simulate-insn.h
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index b9567a1..26cee10 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -410,6 +410,7 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn);
typedef bool (pstate_check_t)(unsigned long);
extern pstate_check_t * const opcode_condition_checks[16];
+
#endif /* __ASSEMBLY__ */
#endif /* __ASM_INSN_H */
diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h
index c5fcbe6..d524f7d 100644
--- a/arch/arm64/include/asm/probes.h
+++ b/arch/arm64/include/asm/probes.h
@@ -15,11 +15,12 @@
#ifndef _ARM_PROBES_H
#define _ARM_PROBES_H
+#include <asm/opcodes.h>
+
struct kprobe;
struct arch_specific_insn;
typedef u32 kprobe_opcode_t;
-typedef unsigned long (kprobes_pstate_check_t)(unsigned long);
typedef void (kprobes_handler_t) (u32 opcode, long addr, struct pt_regs *);
enum pc_restore_type {
@@ -35,7 +36,7 @@ struct kprobe_pc_restore {
/* architecture specific copy of original instruction */
struct arch_specific_insn {
kprobe_opcode_t *insn;
- kprobes_pstate_check_t *pstate_cc;
+ pstate_check_t *pstate_cc;
kprobes_handler_t *handler;
/* restore address after step xol */
struct kprobe_pc_restore restore;
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 4efb791..08325e5 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -36,7 +36,8 @@ arm64-obj-$(CONFIG_CPU_PM) += sleep.o suspend.o
arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
arm64-obj-$(CONFIG_KGDB) += kgdb.o
-arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o
+arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o \
+ probes-simulate-insn.o
arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o
arm64-obj-$(CONFIG_PCI) += pci.o
arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 9f15ceb..f9a3432 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -30,6 +30,7 @@
#include <asm/cacheflush.h>
#include <asm/debug-monitors.h>
#include <asm/fixmap.h>
+#include <asm/opcodes.h>
#include <asm/insn.h>
#define AARCH64_INSN_SF_BIT BIT(31)
diff --git a/arch/arm64/kernel/kprobes-arm64.c b/arch/arm64/kernel/kprobes-arm64.c
index e07727a..487238a 100644
--- a/arch/arm64/kernel/kprobes-arm64.c
+++ b/arch/arm64/kernel/kprobes-arm64.c
@@ -21,6 +21,7 @@
#include <asm/sections.h>
#include "kprobes-arm64.h"
+#include "probes-simulate-insn.h"
static bool __kprobes aarch64_insn_is_steppable(u32 insn)
{
@@ -62,8 +63,36 @@ arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
*/
if (aarch64_insn_is_steppable(insn))
return INSN_GOOD;
+
+ if (aarch64_insn_is_bcond(insn)) {
+ asi->handler = simulate_b_cond;
+ } else if (aarch64_insn_is_cbz(insn) ||
+ aarch64_insn_is_cbnz(insn)) {
+ asi->handler = simulate_cbz_cbnz;
+ } else if (aarch64_insn_is_tbz(insn) ||
+ aarch64_insn_is_tbnz(insn)) {
+ asi->handler = simulate_tbz_tbnz;
+ } else if (aarch64_insn_is_adr_adrp(insn))
+ asi->handler = simulate_adr_adrp;
+ else if (aarch64_insn_is_b(insn) ||
+ aarch64_insn_is_bl(insn))
+ asi->handler = simulate_b_bl;
+ else if (aarch64_insn_is_br(insn) ||
+ aarch64_insn_is_blr(insn) ||
+ aarch64_insn_is_ret(insn))
+ asi->handler = simulate_br_blr_ret;
+ else if (aarch64_insn_is_ldr_lit(insn))
+ asi->handler = simulate_ldr_literal;
+ else if (aarch64_insn_is_ldrsw_lit(insn))
+ asi->handler = simulate_ldrsw_literal;
else
+ /*
+ * Instruction cannot be stepped out-of-line and we don't
+ * (yet) simulate it.
+ */
return INSN_REJECTED;
+
+ return INSN_GOOD_NO_SLOT;
}
static bool __kprobes
diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
index e72dbce..ffc5affd 100644
--- a/arch/arm64/kernel/kprobes.c
+++ b/arch/arm64/kernel/kprobes.c
@@ -40,6 +40,9 @@ void jprobe_return_break(void);
DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
+static void __kprobes
+post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
+
static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
{
/* prepare insn slot */
@@ -57,6 +60,24 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
p->ainsn.restore.type = RESTORE_PC;
}
+static void __kprobes arch_prepare_simulate(struct kprobe *p)
+{
+ /* This instruction is not executed xol. No need to adjust the PC */
+ p->ainsn.restore.addr = 0;
+ p->ainsn.restore.type = NO_RESTORE;
+}
+
+static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
+{
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+
+ if (p->ainsn.handler)
+ p->ainsn.handler((u32)p->opcode, (long)p->addr, regs);
+
+ /* single step simulated, now go for post processing */
+ post_kprobe_handler(kcb, regs);
+}
+
int __kprobes arch_prepare_kprobe(struct kprobe *p)
{
unsigned long probe_addr = (unsigned long)p->addr;
@@ -73,7 +94,8 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
return -EINVAL;
case INSN_GOOD_NO_SLOT: /* insn need simulation */
- return -EINVAL;
+ p->ainsn.insn = NULL;
+ break;
case INSN_GOOD: /* instruction uses slot */
p->ainsn.insn = get_insn_slot();
@@ -83,7 +105,10 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
};
/* prepare the instruction */
- arch_prepare_ss_slot(p);
+ if (p->ainsn.insn)
+ arch_prepare_ss_slot(p);
+ else
+ arch_prepare_simulate(p);
return 0;
}
@@ -225,7 +250,8 @@ static void __kprobes setup_singlestep(struct kprobe *p,
kernel_enable_single_step(regs);
instruction_pointer(regs) = slot;
} else {
- BUG();
+ /* insn simulation */
+ arch_simulate_insn(p, regs);
}
}
diff --git a/arch/arm64/kernel/probes-simulate-insn.c b/arch/arm64/kernel/probes-simulate-insn.c
new file mode 100644
index 0000000..94333a6
--- /dev/null
+++ b/arch/arm64/kernel/probes-simulate-insn.c
@@ -0,0 +1,218 @@
+/*
+ * arch/arm64/kernel/probes-simulate-insn.c
+ *
+ * Copyright (C) 2013 Linaro Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/kprobes.h>
+#include <linux/module.h>
+
+#include "probes-simulate-insn.h"
+
+#define sign_extend(x, signbit) \
+ ((x) | (0 - ((x) & (1 << (signbit)))))
+
+#define bbl_displacement(insn) \
+ sign_extend(((insn) & 0x3ffffff) << 2, 27)
+
+#define bcond_displacement(insn) \
+ sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
+
+#define cbz_displacement(insn) \
+ sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
+
+#define tbz_displacement(insn) \
+ sign_extend(((insn >> 5) & 0x3fff) << 2, 15)
+
+#define ldr_displacement(insn) \
+ sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
+
+static inline void set_x_reg(struct pt_regs *regs, int reg, u64 val)
+{
+ if (reg < 31)
+ regs->regs[reg] = val;
+}
+
+static inline void set_w_reg(struct pt_regs *regs, int reg, u64 val)
+{
+ if (reg < 31)
+ *(u32 *) (&regs->regs[reg]) = val;
+}
+
+static inline u64 get_x_reg(struct pt_regs *regs, int reg)
+{
+ if (reg < 31)
+ return regs->regs[reg];
+ else
+ return 0;
+}
+
+static inline u32 get_w_reg(struct pt_regs *regs, int reg)
+{
+ if (reg < 31)
+ return regs->regs[reg] & 0xffffffff;
+ else
+ return 0;
+}
+
+static bool __kprobes check_cbz(u32 opcode, struct pt_regs *regs)
+{
+ int xn = opcode & 0x1f;
+
+ return (opcode & (1 << 31)) ?
+ (get_x_reg(regs, xn) == 0) : (get_w_reg(regs, xn) == 0);
+}
+
+static bool __kprobes check_cbnz(u32 opcode, struct pt_regs *regs)
+{
+ int xn = opcode & 0x1f;
+
+ return (opcode & (1 << 31)) ?
+ (get_x_reg(regs, xn) != 0) : (get_w_reg(regs, xn) != 0);
+}
+
+static bool __kprobes check_tbz(u32 opcode, struct pt_regs *regs)
+{
+ int xn = opcode & 0x1f;
+ int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);
+
+ return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) == 0;
+}
+
+static bool __kprobes check_tbnz(u32 opcode, struct pt_regs *regs)
+{
+ int xn = opcode & 0x1f;
+ int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);
+
+ return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) != 0;
+}
+
+/*
+ * instruction simulation functions
+ */
+void __kprobes
+simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs)
+{
+ long imm, xn, val;
+
+ xn = opcode & 0x1f;
+ imm = ((opcode >> 3) & 0x1ffffc) | ((opcode >> 29) & 0x3);
+ imm = sign_extend(imm, 20);
+ if (opcode & 0x80000000)
+ val = (imm<<12) + (addr & 0xfffffffffffff000);
+ else
+ val = imm + addr;
+
+ set_x_reg(regs, xn, val);
+
+ instruction_pointer(regs) += 4;
+}
+
+void __kprobes
+simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs)
+{
+ int disp = bbl_displacement(opcode);
+
+ /* Link register is x30 */
+ if (opcode & (1 << 31))
+ set_x_reg(regs, 30, addr + 4);
+
+ instruction_pointer(regs) = addr + disp;
+}
+
+void __kprobes
+simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs)
+{
+ int disp = 4;
+
+ if (opcode_condition_checks[opcode & 0xf](regs->pstate & 0xffffffff))
+ disp = bcond_displacement(opcode);
+
+ instruction_pointer(regs) = addr + disp;
+}
+
+void __kprobes
+simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs)
+{
+ int xn = (opcode >> 5) & 0x1f;
+
+ /* update pc first in case we're doing a "blr lr" */
+ instruction_pointer(regs) = get_x_reg(regs, xn);
+
+ /* Link register is x30 */
+ if (((opcode >> 21) & 0x3) == 1)
+ set_x_reg(regs, 30, addr + 4);
+}
+
+void __kprobes
+simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs)
+{
+ int disp = 4;
+
+ if (opcode & (1 << 24)) {
+ if (check_cbnz(opcode, regs))
+ disp = cbz_displacement(opcode);
+ } else {
+ if (check_cbz(opcode, regs))
+ disp = cbz_displacement(opcode);
+ }
+ instruction_pointer(regs) = addr + disp;
+}
+
+void __kprobes
+simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs)
+{
+ int disp = 4;
+
+ if (opcode & (1 << 24)) {
+ if (check_tbnz(opcode, regs))
+ disp = tbz_displacement(opcode);
+ } else {
+ if (check_tbz(opcode, regs))
+ disp = tbz_displacement(opcode);
+ }
+ instruction_pointer(regs) = addr + disp;
+}
+
+void __kprobes
+simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)
+{
+ u64 *load_addr;
+ int xn = opcode & 0x1f;
+ int disp;
+
+ disp = ldr_displacement(opcode);
+ load_addr = (u64 *) (addr + disp);
+
+ if (opcode & (1 << 30)) /* x0-x30 */
+ set_x_reg(regs, xn, *load_addr);
+ else /* w0-w30 */
+ set_w_reg(regs, xn, (*(u32 *) (load_addr)));
+
+ instruction_pointer(regs) += 4;
+}
+
+void __kprobes
+simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs)
+{
+ s32 *load_addr;
+ int xn = opcode & 0x1f;
+ int disp;
+
+ disp = ldr_displacement(opcode);
+ load_addr = (s32 *) (addr + disp);
+
+ set_x_reg(regs, xn, *load_addr);
+
+ instruction_pointer(regs) += 4;
+}
diff --git a/arch/arm64/kernel/probes-simulate-insn.h b/arch/arm64/kernel/probes-simulate-insn.h
new file mode 100644
index 0000000..d6bb9a5
--- /dev/null
+++ b/arch/arm64/kernel/probes-simulate-insn.h
@@ -0,0 +1,28 @@
+/*
+ * arch/arm64/kernel/probes-simulate-insn.h
+ *
+ * Copyright (C) 2013 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef _ARM_KERNEL_PROBES_SIMULATE_INSN_H
+#define _ARM_KERNEL_PROBES_SIMULATE_INSN_H
+
+void simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs);
+
+#endif /* _ARM_KERNEL_PROBES_SIMULATE_INSN_H */
--
2.5.0
From: Sandeepa Prabhu <[email protected]>
Add support for basic kernel probes (kprobes) and jump probes
(jprobes) for ARM64.
Kprobes utilizes software breakpoint and single step debug
exceptions supported on ARM v8.
A software breakpoint is placed at the probe address to trap the
kernel execution into the kprobe handler.
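As an illustration of the breakpoint encoding, the sketch below composes a
BRK instruction carrying the kprobes immediate and checks an ESR value
against it. The mask and immediate mirror BRK64_ESR_MASK and
BRK64_ESR_KPROBES from this patch. The value 0xd4200000 for
AARCH64_BREAK_MON and all sketch_* names are assumptions made for a
stand-alone user-space example; this is not the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of AARCH64_BREAK_MON, i.e. the encoding of "BRK #0". */
#define SKETCH_BREAK_MON	0xd4200000u
/* Same values as BRK64_ESR_MASK and BRK64_ESR_KPROBES in this patch. */
#define SKETCH_ESR_MASK		0xFFFFu
#define SKETCH_ESR_KPROBES	0x0004u

/* Hypothetical helper: build "BRK #imm16"; imm16 sits in bits [20:5]. */
static uint32_t sketch_brk_opcode(uint32_t imm16)
{
	assert(imm16 <= SKETCH_ESR_MASK);
	return SKETCH_BREAK_MON | (imm16 << 5);
}

/* Hypothetical helper: recover imm16 from a BRK opcode. */
static uint32_t sketch_brk_imm(uint32_t opcode)
{
	return (opcode >> 5) & SKETCH_ESR_MASK;
}

/* The low 16 bits of the ESR carry the BRK immediate. */
static int sketch_is_kprobes_brk(uint32_t esr)
{
	return (esr & SKETCH_ESR_MASK) == SKETCH_ESR_KPROBES;
}
```

Under these assumptions sketch_brk_opcode(SKETCH_ESR_KPROBES) corresponds
to BRK64_OPCODE_KPROBES, and the ESR test is the same comparison that
brk_handler() performs before calling kprobe_breakpoint_handler().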
ARM v8 supports enabling single stepping before the break exception
return (ERET), with next PC in exception return address (ELR_EL1). The
kprobe handler prepares an executable memory slot for out-of-line
execution with a copy of the original instruction being probed, and
enables single stepping. The PC is set to the out-of-line slot address
before the ERET. With this scheme, the instruction is executed with the
exact same register context except for the PC (and DAIF) registers.
Debug exceptions are unmasked (PSTATE.D cleared) only when single stepping
a recursive kprobe, i.e. during kprobe re-entry, so that the probed
instruction can be single stepped within the kprobe handler (exception)
context.
The recursion depth of kprobe is always 2, i.e. upon probe re-entry,
any further re-entry is prevented by not calling handlers and the case
is counted as a missed kprobe.
Single stepping from the x-o-l slot has a drawback for PC-relative accesses
such as branches and literal loads: the offset from the new PC (the slot
address) is not guaranteed to fit in the immediate field of the opcode.
Such instructions need simulation, so probing them is rejected.
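To make the range problem concrete, here is a minimal user-space sketch
that decodes the byte displacement of a B/BL instruction and tests whether
a target is still reachable from a different PC. It assumes the standard
A64 B/BL layout (imm26 in bits [25:0], scaled by 4, giving a +/-128MB
range). The sketch_* names and the addresses used in the usage note are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low (signbit + 1) bits of x. */
static int64_t sketch_sign_extend(uint64_t x, unsigned int signbit)
{
	uint64_t m = 1ull << signbit;

	assert(signbit < 63);
	x &= (m << 1) - 1;		/* drop anything above the sign bit */
	return (int64_t)((x ^ m) - m);
}

/* Byte displacement of B/BL: imm26 in bits [25:0], scaled by 4. */
static int64_t sketch_b_displacement(uint32_t insn)
{
	return sketch_sign_extend((uint64_t)(insn & 0x3ffffff) << 2, 27);
}

/* Can a B/BL placed at 'from' reach 'target'? Range is +/-128MB. */
static int sketch_b_reaches(uint64_t from, uint64_t target)
{
	int64_t off = (int64_t)(target - from);

	return off >= -(1ll << 27) && off < (1ll << 27);
}
```

For example, a branch whose target is 0x100 bytes away from the probed
address is trivially in range there, but if the slot happens to sit several
hundred megabytes away, sketch_b_reaches() fails for the same target and the
copied instruction could not be re-encoded.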
Instructions generating exceptions or cpu mode change are rejected
for probing.
Exclusive load/store instructions are rejected too. Additionally, the
code is checked to see if it is inside an exclusive load/store sequence
(code from Pratyush).
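The backward scan can be sketched in user space as follows. The two
mask/value pairs are the load_ex and store_ex definitions this patch adds
to insn.h, and the window matches MAX_ATOMIC_CONTEXT_SIZE. The array-based
interface and the sketch_* names are assumptions for illustration; the
kernel version works on instruction pointers and clamps the window to the
start of the kernel or module text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 128 bytes of 4-byte instructions, as in MAX_ATOMIC_CONTEXT_SIZE. */
#define SKETCH_MAX_ATOMIC_INSNS	(128 / sizeof(uint32_t))

/* Mask/value pairs taken from the load_ex/store_ex lines in this patch. */
static int sketch_is_load_ex(uint32_t insn)
{
	return (insn & 0x3F400000u) == 0x08400000u;
}

static int sketch_is_store_ex(uint32_t insn)
{
	return (insn & 0x3F400000u) == 0x08000000u;
}

/*
 * Walk backwards from the instruction preceding text[idx]. Seeing an
 * exclusive store first means any earlier sequence is already closed.
 * Seeing an exclusive load first means text[idx] lies inside an open
 * load/store-exclusive sequence and should not be probed.
 */
static int sketch_in_atomic_region(const uint32_t *text, size_t idx)
{
	size_t back;

	assert(text != NULL);
	for (back = 1; back < SKETCH_MAX_ATOMIC_INSNS && back <= idx; back++) {
		uint32_t insn = text[idx - back];

		if (sketch_is_store_ex(insn))
			return 0;
		if (sketch_is_load_ex(insn))
			return 1;
	}
	return 0;
}
```

With a sequence such as ldxr, add, stxr, nop, the add and the stxr are
reported as being inside the atomic region, while the ldxr and the trailing
nop are not. The ldxr and stxr themselves are rejected separately by the
exclusive-instruction check.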
System instructions are mostly enabled for stepping, except MSR/MRS
accesses to "DAIF" flags in PSTATE, which are not safe for
probing.
Thanks to Steve Capper and Pratyush Anand for several suggested
changes.
Signed-off-by: Sandeepa Prabhu <[email protected]>
Signed-off-by: David A. Long <[email protected]>
Signed-off-by: Pratyush Anand <[email protected]>
---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/debug-monitors.h | 5 +
arch/arm64/include/asm/insn.h | 4 +-
arch/arm64/include/asm/kprobes.h | 60 ++++
arch/arm64/include/asm/probes.h | 44 +++
arch/arm64/include/asm/ptrace.h | 2 +-
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/debug-monitors.c | 18 +-
arch/arm64/kernel/kprobes-arm64.c | 121 ++++++++
arch/arm64/kernel/kprobes-arm64.h | 35 +++
arch/arm64/kernel/kprobes.c | 512 ++++++++++++++++++++++++++++++++
arch/arm64/kernel/vmlinux.lds.S | 1 +
arch/arm64/mm/fault.c | 25 ++
13 files changed, 824 insertions(+), 5 deletions(-)
create mode 100644 arch/arm64/include/asm/kprobes.h
create mode 100644 arch/arm64/include/asm/probes.h
create mode 100644 arch/arm64/kernel/kprobes-arm64.c
create mode 100644 arch/arm64/kernel/kprobes-arm64.h
create mode 100644 arch/arm64/kernel/kprobes.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 4211b0d..c395386 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -81,6 +81,7 @@ config ARM64
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RCU_TABLE_FREE
select HAVE_SYSCALL_TRACEPOINTS
+ select HAVE_KPROBES
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 279c85b5..274ab60 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -78,6 +78,11 @@
#define CACHE_FLUSH_IS_SAFE 1
+/* kprobes BRK opcodes with ESR encoding */
+#define BRK64_ESR_MASK 0xFFFF
+#define BRK64_ESR_KPROBES 0x0004
+#define BRK64_OPCODE_KPROBES (AARCH64_BREAK_MON | (BRK64_ESR_KPROBES << 5))
+
/* AArch32 */
#define DBG_ESR_EVT_BKPT 0x4
#define DBG_ESR_EVT_VECC 0x5
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 72dda48..b9567a1 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -253,6 +253,8 @@ __AARCH64_INSN_FUNCS(ldr_reg, 0x3FE0EC00, 0x38606800)
__AARCH64_INSN_FUNCS(ldr_lit, 0xBF000000, 0x18000000)
__AARCH64_INSN_FUNCS(ldrsw_lit, 0xFF000000, 0x98000000)
__AARCH64_INSN_FUNCS(exclusive, 0x3F800000, 0x08000000)
+__AARCH64_INSN_FUNCS(load_ex, 0x3F400000, 0x08400000)
+__AARCH64_INSN_FUNCS(store_ex, 0x3F400000, 0x08000000)
__AARCH64_INSN_FUNCS(stp_post, 0x7FC00000, 0x28800000)
__AARCH64_INSN_FUNCS(ldp_post, 0x7FC00000, 0x28C00000)
__AARCH64_INSN_FUNCS(stp_pre, 0x7FC00000, 0x29800000)
@@ -401,7 +403,7 @@ bool aarch32_insn_is_wide(u32 insn);
#define A32_RT_OFFSET 12
#define A32_RT2_OFFSET 0
-u32 aarch64_extract_system_register(u32 insn);
+u32 aarch64_insn_extract_system_reg(u32 insn);
u32 aarch32_insn_extract_reg_num(u32 insn, int offset);
u32 aarch32_insn_mcr_extract_opc2(u32 insn);
u32 aarch32_insn_mcr_extract_crm(u32 insn);
diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
new file mode 100644
index 0000000..79c9511
--- /dev/null
+++ b/arch/arm64/include/asm/kprobes.h
@@ -0,0 +1,60 @@
+/*
+ * arch/arm64/include/asm/kprobes.h
+ *
+ * Copyright (C) 2013 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef _ARM_KPROBES_H
+#define _ARM_KPROBES_H
+
+#include <linux/types.h>
+#include <linux/ptrace.h>
+#include <linux/percpu.h>
+
+#define __ARCH_WANT_KPROBES_INSN_SLOT
+#define MAX_INSN_SIZE 1
+#define MAX_STACK_SIZE 128
+
+#define flush_insn_slot(p) do { } while (0)
+#define kretprobe_blacklist_size 0
+
+#include <asm/probes.h>
+
+struct prev_kprobe {
+ struct kprobe *kp;
+ unsigned int status;
+};
+
+/* Single step context for kprobe */
+struct kprobe_step_ctx {
+ unsigned long ss_pending;
+ unsigned long match_addr;
+};
+
+/* per-cpu kprobe control block */
+struct kprobe_ctlblk {
+ unsigned int kprobe_status;
+ unsigned long saved_irqflag;
+ struct prev_kprobe prev_kprobe;
+ struct kprobe_step_ctx ss_ctx;
+ struct pt_regs jprobe_saved_regs;
+ char jprobes_stack[MAX_STACK_SIZE];
+};
+
+void arch_remove_kprobe(struct kprobe *);
+int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
+int kprobe_exceptions_notify(struct notifier_block *self,
+ unsigned long val, void *data);
+int kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr);
+int kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr);
+
+#endif /* _ARM_KPROBES_H */
diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h
new file mode 100644
index 0000000..c5fcbe6
--- /dev/null
+++ b/arch/arm64/include/asm/probes.h
@@ -0,0 +1,44 @@
+/*
+ * arch/arm64/include/asm/probes.h
+ *
+ * Copyright (C) 2013 Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+#ifndef _ARM_PROBES_H
+#define _ARM_PROBES_H
+
+struct kprobe;
+struct arch_specific_insn;
+
+typedef u32 kprobe_opcode_t;
+typedef unsigned long (kprobes_pstate_check_t)(unsigned long);
+typedef void (kprobes_handler_t) (u32 opcode, long addr, struct pt_regs *);
+
+enum pc_restore_type {
+ NO_RESTORE,
+ RESTORE_PC,
+};
+
+struct kprobe_pc_restore {
+ enum pc_restore_type type;
+ unsigned long addr;
+};
+
+/* architecture specific copy of original instruction */
+struct arch_specific_insn {
+ kprobe_opcode_t *insn;
+ kprobes_pstate_check_t *pstate_cc;
+ kprobes_handler_t *handler;
+ /* restore address after step xol */
+ struct kprobe_pc_restore restore;
+};
+
+#endif
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 7bd6445..88b0a7e 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -212,7 +212,7 @@ static inline int valid_user_regs(struct user_pt_regs *regs)
return 0;
}
-#define instruction_pointer(regs) ((unsigned long)(regs)->pc)
+#define instruction_pointer(regs) ((regs)->pc)
extern unsigned long profile_pc(struct pt_regs *regs);
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index fd5f163..4efb791 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -36,6 +36,7 @@ arm64-obj-$(CONFIG_CPU_PM) += sleep.o suspend.o
arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
arm64-obj-$(CONFIG_KGDB) += kgdb.o
+arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o
arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o
arm64-obj-$(CONFIG_PCI) += pci.o
arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index c536c9e..803fbd6 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -23,6 +23,7 @@
#include <linux/hardirq.h>
#include <linux/init.h>
#include <linux/ptrace.h>
+#include <linux/kprobes.h>
#include <linux/stat.h>
#include <linux/uaccess.h>
@@ -266,10 +267,14 @@ static int single_step_handler(unsigned long addr, unsigned int esr,
*/
user_rewind_single_step(current);
} else {
+#ifdef CONFIG_KPROBES
+ if (kprobe_single_step_handler(regs, esr) == DBG_HOOK_HANDLED)
+ return 0;
+#endif
if (call_step_hook(regs, esr) == DBG_HOOK_HANDLED)
return 0;
- pr_warning("Unexpected kernel single-step exception at EL1\n");
+ pr_warn("Unexpected kernel single-step exception at EL1\n");
/*
* Re-enable stepping since we know that we will be
* returning to regs.
@@ -322,8 +327,15 @@ static int brk_handler(unsigned long addr, unsigned int esr,
{
if (user_mode(regs)) {
send_user_sigtrap(TRAP_BRKPT);
- } else if (call_break_hook(regs, esr) != DBG_HOOK_HANDLED) {
- pr_warning("Unexpected kernel BRK exception at EL1\n");
+ }
+#ifdef CONFIG_KPROBES
+ else if ((esr & BRK64_ESR_MASK) == BRK64_ESR_KPROBES) {
+ if (kprobe_breakpoint_handler(regs, esr) != DBG_HOOK_HANDLED)
+ return -EFAULT;
+ }
+#endif
+ else if (call_break_hook(regs, esr) != DBG_HOOK_HANDLED) {
+ pr_warn("Unexpected kernel BRK exception at EL1\n");
return -EFAULT;
}
diff --git a/arch/arm64/kernel/kprobes-arm64.c b/arch/arm64/kernel/kprobes-arm64.c
new file mode 100644
index 0000000..e07727a
--- /dev/null
+++ b/arch/arm64/kernel/kprobes-arm64.c
@@ -0,0 +1,121 @@
+/*
+ * arch/arm64/kernel/kprobes-arm64.c
+ *
+ * Copyright (C) 2013 Linaro Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/kprobes.h>
+#include <linux/module.h>
+#include <asm/kprobes.h>
+#include <asm/insn.h>
+#include <asm/sections.h>
+
+#include "kprobes-arm64.h"
+
+static bool __kprobes aarch64_insn_is_steppable(u32 insn)
+{
+ if (aarch64_get_insn_class(insn) == AARCH64_INSN_CLS_BR_SYS) {
+ if (aarch64_insn_is_branch(insn) ||
+ aarch64_insn_is_msr_imm(insn) ||
+ aarch64_insn_is_msr_reg(insn) ||
+ aarch64_insn_is_exception(insn))
+ return false;
+
+ if (aarch64_insn_is_mrs(insn))
+ return aarch64_insn_extract_system_reg(insn)
+ != AARCH64_INSN_SPCLREG_DAIF;
+
+ if (aarch64_insn_is_hint(insn))
+ return aarch64_insn_is_nop(insn);
+
+ return true;
+ }
+
+ if (aarch64_insn_uses_literal(insn) ||
+ aarch64_insn_is_exclusive(insn))
+ return false;
+
+ return true;
+}
+
+/* Return:
+ * INSN_REJECTED If instruction is one not allowed to kprobe,
+ * INSN_GOOD If instruction is supported and uses instruction slot,
+ * INSN_GOOD_NO_SLOT If instruction is supported but doesn't use its slot.
+ */
+static enum kprobe_insn __kprobes
+arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
+{
+ /*
+ * Instructions reading or modifying the PC won't work from the XOL
+ * slot.
+ */
+ if (aarch64_insn_is_steppable(insn))
+ return INSN_GOOD;
+ else
+ return INSN_REJECTED;
+}
+
+static bool __kprobes
+is_probed_address_atomic(kprobe_opcode_t *scan_start, kprobe_opcode_t *scan_end)
+{
+ while (scan_start > scan_end) {
+ /*
+ * atomic region starts from exclusive load and ends with
+ * exclusive store.
+ */
+ if (aarch64_insn_is_store_ex(le32_to_cpu(*scan_start)))
+ return false;
+ else if (aarch64_insn_is_load_ex(le32_to_cpu(*scan_start)))
+ return true;
+ scan_start--;
+ }
+
+ return false;
+}
+
+enum kprobe_insn __kprobes
+arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
+{
+ enum kprobe_insn decoded;
+ kprobe_opcode_t insn = le32_to_cpu(*addr);
+ kprobe_opcode_t *scan_start = addr - 1;
+ kprobe_opcode_t *scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE;
+#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
+ struct module *mod;
+#endif
+
+ if (addr >= (kprobe_opcode_t *)_text &&
+ scan_end < (kprobe_opcode_t *)_text)
+ scan_end = (kprobe_opcode_t *)_text;
+#if defined(CONFIG_MODULES) && defined(MODULES_VADDR)
+ else {
+ preempt_disable();
+ mod = __module_address((unsigned long)addr);
+ if (mod && within_module_init((unsigned long)addr, mod) &&
+ !within_module_init((unsigned long)scan_end, mod))
+ scan_end = (kprobe_opcode_t *)mod->init_layout.base;
+ else if (mod && within_module_core((unsigned long)addr, mod) &&
+ !within_module_core((unsigned long)scan_end, mod))
+ scan_end = (kprobe_opcode_t *)mod->core_layout.base;
+ preempt_enable();
+ }
+#endif
+ decoded = arm_probe_decode_insn(insn, asi);
+
+ if (decoded == INSN_REJECTED ||
+ is_probed_address_atomic(scan_start, scan_end))
+ return INSN_REJECTED;
+
+ return decoded;
+}
diff --git a/arch/arm64/kernel/kprobes-arm64.h b/arch/arm64/kernel/kprobes-arm64.h
new file mode 100644
index 0000000..e8378d3
--- /dev/null
+++ b/arch/arm64/kernel/kprobes-arm64.h
@@ -0,0 +1,35 @@
+/*
+ * arch/arm64/kernel/kprobes-arm64.h
+ *
+ * Copyright (C) 2013 Linaro Limited.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef _ARM_KERNEL_KPROBES_ARM64_H
+#define _ARM_KERNEL_KPROBES_ARM64_H
+
+/*
+ * ARM strongly recommends a limit of 128 bytes between LoadExcl and
+ * StoreExcl instructions in a single thread of execution. So keep the
+ * max atomic context size as 32.
+ */
+#define MAX_ATOMIC_CONTEXT_SIZE (128 / sizeof(kprobe_opcode_t))
+
+enum kprobe_insn {
+ INSN_REJECTED,
+ INSN_GOOD_NO_SLOT,
+ INSN_GOOD,
+};
+
+enum kprobe_insn __kprobes
+arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi);
+
+#endif /* _ARM_KERNEL_KPROBES_ARM64_H */
diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
new file mode 100644
index 0000000..e72dbce
--- /dev/null
+++ b/arch/arm64/kernel/kprobes.c
@@ -0,0 +1,512 @@
+/*
+ * arch/arm64/kernel/kprobes.c
+ *
+ * Kprobes support for ARM64
+ *
+ * Copyright (C) 2013 Linaro Limited.
+ * Author: Sandeepa Prabhu <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ */
+#include <linux/kernel.h>
+#include <linux/kprobes.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/stop_machine.h>
+#include <linux/stringify.h>
+#include <asm/traps.h>
+#include <asm/ptrace.h>
+#include <asm/cacheflush.h>
+#include <asm/debug-monitors.h>
+#include <asm/system_misc.h>
+#include <asm/insn.h>
+#include <asm/uaccess.h>
+
+#include "kprobes-arm64.h"
+
+#define MIN_STACK_SIZE(addr) min((unsigned long)MAX_STACK_SIZE, \
+ (unsigned long)current_thread_info() + THREAD_START_SP - (addr))
+
+void jprobe_return_break(void);
+
+DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
+DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
+
+static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
+{
+ /* prepare insn slot */
+ p->ainsn.insn[0] = cpu_to_le32(p->opcode);
+
+ flush_icache_range((uintptr_t) (p->ainsn.insn),
+ (uintptr_t) (p->ainsn.insn) +
+ MAX_INSN_SIZE * sizeof(kprobe_opcode_t));
+
+ /*
+ * Needs restoring of return address after stepping xol.
+ */
+ p->ainsn.restore.addr = (unsigned long) p->addr +
+ sizeof(kprobe_opcode_t);
+ p->ainsn.restore.type = RESTORE_PC;
+}
+
+int __kprobes arch_prepare_kprobe(struct kprobe *p)
+{
+ unsigned long probe_addr = (unsigned long)p->addr;
+
+ /* copy instruction */
+ p->opcode = le32_to_cpu(*p->addr);
+
+ if (in_exception_text(probe_addr))
+ return -EINVAL;
+
+ /* decode instruction */
+ switch (arm_kprobe_decode_insn(p->addr, &p->ainsn)) {
+ case INSN_REJECTED: /* insn not supported */
+ return -EINVAL;
+
+ case INSN_GOOD_NO_SLOT: /* insn need simulation */
+ return -EINVAL;
+
+ case INSN_GOOD: /* instruction uses slot */
+ p->ainsn.insn = get_insn_slot();
+ if (!p->ainsn.insn)
+ return -ENOMEM;
+ break;
+ };
+
+ /* prepare the instruction */
+ arch_prepare_ss_slot(p);
+
+ return 0;
+}
+
+static int __kprobes patch_text(kprobe_opcode_t *addr, u32 opcode)
+{
+ void *addrs[1];
+ u32 insns[1];
+
+ addrs[0] = (void *)addr;
+ insns[0] = (u32)opcode;
+
+ return aarch64_insn_patch_text(addrs, insns, 1);
+}
+
+/* arm kprobe: install breakpoint in text */
+void __kprobes arch_arm_kprobe(struct kprobe *p)
+{
+ patch_text(p->addr, BRK64_OPCODE_KPROBES);
+}
+
+/* disarm kprobe: remove breakpoint from text */
+void __kprobes arch_disarm_kprobe(struct kprobe *p)
+{
+ patch_text(p->addr, p->opcode);
+}
+
+void __kprobes arch_remove_kprobe(struct kprobe *p)
+{
+ if (p->ainsn.insn) {
+ free_insn_slot(p->ainsn.insn, 0);
+ p->ainsn.insn = NULL;
+ }
+}
+
+static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
+{
+ kcb->prev_kprobe.kp = kprobe_running();
+ kcb->prev_kprobe.status = kcb->kprobe_status;
+}
+
+static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
+{
+ __this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
+ kcb->kprobe_status = kcb->prev_kprobe.status;
+}
+
+static void __kprobes set_current_kprobe(struct kprobe *p)
+{
+ __this_cpu_write(current_kprobe, p);
+}
+
+/*
+ * The D-flag (Debug mask) is set (masked) upon debug exception entry.
+ * Kprobes needs to clear (unmask) D-flag -ONLY- in case of recursive
+ * probe i.e. when probe hit from kprobe handler context upon
+ * executing the pre/post handlers. In this case we return with
+ * D-flag clear so that single-stepping can be carried-out.
+ *
+ * Leave D-flag set in all other cases.
+ */
+static void __kprobes
+spsr_set_debug_flag(struct pt_regs *regs, int mask)
+{
+ unsigned long spsr = regs->pstate;
+
+ if (mask)
+ spsr |= PSR_D_BIT;
+ else
+ spsr &= ~PSR_D_BIT;
+
+ regs->pstate = spsr;
+}
+
+/*
+ * Interrupts need to be disabled before single-step mode is set, and not
+ * reenabled until after single-step mode ends.
+ * Without disabling interrupts on the local CPU, an interrupt could
+ * occur between the exception return and the start of the
+ * out-of-line single-step, which would result in wrongly single
+ * stepping into the interrupt handler.
+ */
+static void __kprobes kprobes_save_local_irqflag(struct pt_regs *regs)
+{
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+
+ kcb->saved_irqflag = regs->pstate;
+ regs->pstate |= PSR_I_BIT;
+}
+
+static void __kprobes kprobes_restore_local_irqflag(struct pt_regs *regs)
+{
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+
+ if (kcb->saved_irqflag & PSR_I_BIT)
+ regs->pstate |= PSR_I_BIT;
+ else
+ regs->pstate &= ~PSR_I_BIT;
+}
+
+static void __kprobes
+set_ss_context(struct kprobe_ctlblk *kcb, unsigned long addr)
+{
+ kcb->ss_ctx.ss_pending = true;
+ kcb->ss_ctx.match_addr = addr + sizeof(kprobe_opcode_t);
+}
+
+static void __kprobes clear_ss_context(struct kprobe_ctlblk *kcb)
+{
+ kcb->ss_ctx.ss_pending = false;
+ kcb->ss_ctx.match_addr = 0;
+}
+
+static void __kprobes setup_singlestep(struct kprobe *p,
+ struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb, int reenter)
+{
+ unsigned long slot;
+
+ if (reenter) {
+ save_previous_kprobe(kcb);
+ set_current_kprobe(p);
+ kcb->kprobe_status = KPROBE_REENTER;
+ } else {
+ kcb->kprobe_status = KPROBE_HIT_SS;
+ }
+
+ if (p->ainsn.insn) {
+ /* prepare for single stepping */
+ slot = (unsigned long)p->ainsn.insn;
+
+ set_ss_context(kcb, slot); /* mark pending ss */
+
+ if (kcb->kprobe_status == KPROBE_REENTER)
+ spsr_set_debug_flag(regs, 0);
+
+ /* IRQs and single stepping do not mix well. */
+ kprobes_save_local_irqflag(regs);
+ kernel_enable_single_step(regs);
+ instruction_pointer(regs) = slot;
+ } else {
+ BUG();
+ }
+}
+
+static int __kprobes reenter_kprobe(struct kprobe *p,
+ struct pt_regs *regs,
+ struct kprobe_ctlblk *kcb)
+{
+ switch (kcb->kprobe_status) {
+ case KPROBE_HIT_SSDONE:
+ case KPROBE_HIT_ACTIVE:
+ kprobes_inc_nmissed_count(p);
+ setup_singlestep(p, regs, kcb, 1);
+ break;
+ case KPROBE_HIT_SS:
+ case KPROBE_REENTER:
+ pr_warn("Unrecoverable kprobe detected at %p.\n", p->addr);
+ dump_kprobe(p);
+ BUG();
+ break;
+ default:
+ WARN_ON(1);
+ return 0;
+ }
+
+ return 1;
+}
+
+static void __kprobes
+post_kprobe_handler(struct kprobe_ctlblk *kcb, struct pt_regs *regs)
+{
+ struct kprobe *cur = kprobe_running();
+
+ if (!cur)
+ return;
+
+ /* return addr restore if non-branching insn */
+ if (cur->ainsn.restore.type == RESTORE_PC) {
+ instruction_pointer(regs) = cur->ainsn.restore.addr;
+ if (!instruction_pointer(regs))
+ BUG();
+ }
+
+ /* restore back original saved kprobe variables and continue */
+ if (kcb->kprobe_status == KPROBE_REENTER) {
+ restore_previous_kprobe(kcb);
+ return;
+ }
+ /* call post handler */
+ kcb->kprobe_status = KPROBE_HIT_SSDONE;
+ if (cur->post_handler) {
+ /* post_handler can hit breakpoint and single step
+ * again, so we enable D-flag for recursive exception.
+ */
+ cur->post_handler(cur, regs, 0);
+ }
+
+ reset_current_kprobe();
+}
+
+int __kprobes kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr)
+{
+ struct kprobe *cur = kprobe_running();
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+
+ switch (kcb->kprobe_status) {
+ case KPROBE_HIT_SS:
+ case KPROBE_REENTER:
+ /*
+ * We are here because the instruction being single
+ * stepped caused a page fault. We reset the current
+ * kprobe and the ip points back to the probe address
+ * and allow the page fault handler to continue as a
+ * normal page fault.
+ */
+ instruction_pointer(regs) = (unsigned long)cur->addr;
+ if (!instruction_pointer(regs))
+ BUG();
+ if (kcb->kprobe_status == KPROBE_REENTER)
+ restore_previous_kprobe(kcb);
+ else
+ reset_current_kprobe();
+
+ break;
+ case KPROBE_HIT_ACTIVE:
+ case KPROBE_HIT_SSDONE:
+ /*
+ * We increment the nmissed count for accounting,
+ * we can also use npre/npostfault count for accounting
+ * these specific fault cases.
+ */
+ kprobes_inc_nmissed_count(cur);
+
+ /*
+ * We come here because instructions in the pre/post
+ * handler caused the page_fault, this could happen
+ * if handler tries to access user space by
+ * copy_from_user(), get_user() etc. Let the
+ * user-specified handler try to fix it first.
+ */
+ if (cur->fault_handler && cur->fault_handler(cur, regs, fsr))
+ return 1;
+
+ /*
+ * In case the user-specified fault handler returned
+ * zero, try to fix up.
+ */
+ if (fixup_exception(regs))
+ return 1;
+ }
+ return 0;
+}
+
+int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
+ unsigned long val, void *data)
+{
+ return NOTIFY_DONE;
+}
+
+static void __kprobes kprobe_handler(struct pt_regs *regs)
+{
+ struct kprobe *p, *cur_kprobe;
+ struct kprobe_ctlblk *kcb;
+ unsigned long addr = instruction_pointer(regs);
+
+ kcb = get_kprobe_ctlblk();
+ cur_kprobe = kprobe_running();
+
+ p = get_kprobe((kprobe_opcode_t *) addr);
+
+ if (p) {
+ if (cur_kprobe) {
+ if (reenter_kprobe(p, regs, kcb))
+ return;
+ } else {
+ /* Probe hit */
+ set_current_kprobe(p);
+ kcb->kprobe_status = KPROBE_HIT_ACTIVE;
+
+ /*
+ * If we have no pre-handler or it returned 0, we
+ * continue with normal processing. If we have a
+ * pre-handler and it returned non-zero, it prepped
+ * for calling the break_handler below on re-entry,
+ * so get out doing nothing more here.
+ *
+ * pre_handler can hit a breakpoint and single-step before
+ * returning, so keep the PSTATE D-flag enabled until
+ * pre_handler returns.
+ */
+ if (!p->pre_handler || !p->pre_handler(p, regs)) {
+ kcb->kprobe_status = KPROBE_HIT_SS;
+ setup_singlestep(p, regs, kcb, 0);
+ return;
+ }
+ }
+ } else if ((le32_to_cpu(*(kprobe_opcode_t *) addr) ==
+ BRK64_OPCODE_KPROBES) && cur_kprobe) {
+ /* We probably hit a jprobe. Call its break handler. */
+ if (cur_kprobe->break_handler &&
+ cur_kprobe->break_handler(cur_kprobe, regs)) {
+ kcb->kprobe_status = KPROBE_HIT_SS;
+ setup_singlestep(cur_kprobe, regs, kcb, 0);
+ return;
+ }
+ }
+ /*
+ * The breakpoint instruction was removed right
+ * after we hit it. Another cpu has removed
+ * either a probepoint or a debugger breakpoint
+ * at this address. In either case, no further
+ * handling of this interrupt is appropriate.
+ * Return back to original instruction, and continue.
+ */
+}
+
+static int __kprobes
+kprobe_ss_hit(struct kprobe_ctlblk *kcb, unsigned long addr)
+{
+ if ((kcb->ss_ctx.ss_pending)
+ && (kcb->ss_ctx.match_addr == addr)) {
+ clear_ss_context(kcb); /* clear pending ss */
+ return DBG_HOOK_HANDLED;
+ }
+ /* not ours, kprobes should ignore it */
+ return DBG_HOOK_ERROR;
+}
+
+int __kprobes
+kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr)
+{
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+ int retval;
+
+ /* return error if this is not our step */
+ retval = kprobe_ss_hit(kcb, instruction_pointer(regs));
+
+ if (retval == DBG_HOOK_HANDLED) {
+ kprobes_restore_local_irqflag(regs);
+ kernel_disable_single_step();
+
+ if (kcb->kprobe_status == KPROBE_REENTER)
+ spsr_set_debug_flag(regs, 1);
+
+ post_kprobe_handler(kcb, regs);
+ }
+
+ return retval;
+}
+
+int __kprobes
+kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr)
+{
+ kprobe_handler(regs);
+ return DBG_HOOK_HANDLED;
+}
+
+int __kprobes setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
+{
+ struct jprobe *jp = container_of(p, struct jprobe, kp);
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+ long stack_ptr = kernel_stack_pointer(regs);
+
+ kcb->jprobe_saved_regs = *regs;
+ memcpy(kcb->jprobes_stack, (void *)stack_ptr,
+ MIN_STACK_SIZE(stack_ptr));
+
+ instruction_pointer(regs) = (long)jp->entry;
+ preempt_disable();
+ return 1;
+}
+
+void __kprobes jprobe_return(void)
+{
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+
+ /*
+ * The jprobe handler returns by taking a break exception,
+ * encoded the same as a kprobe, but with the following conditions:
+ * - a magic number in x0 to distinguish it from other kprobes
+ * - the stack pointer restored from the originally saved pt_regs
+ */
+ asm volatile ("ldr x0, [%0]\n\t"
+ "mov sp, x0\n\t"
+ ".globl jprobe_return_break\n\t"
+ "jprobe_return_break:\n\t"
+ "brk %1\n\t"
+ :
+ : "r"(&kcb->jprobe_saved_regs.sp),
+ "I"(BRK64_ESR_KPROBES)
+ : "memory");
+}
+
+int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
+{
+ struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
+ long stack_addr = kcb->jprobe_saved_regs.sp;
+ long orig_sp = kernel_stack_pointer(regs);
+ struct jprobe *jp = container_of(p, struct jprobe, kp);
+
+ if (instruction_pointer(regs) != (u64) jprobe_return_break)
+ return 0;
+
+ if (orig_sp != stack_addr) {
+ struct pt_regs *saved_regs =
+ (struct pt_regs *)kcb->jprobe_saved_regs.sp;
+ pr_err("current sp %lx does not match saved sp %lx\n",
+ orig_sp, stack_addr);
+ pr_err("Saved registers for jprobe %p\n", jp);
+ show_regs(saved_regs);
+ pr_err("Current registers\n");
+ show_regs(regs);
+ BUG();
+ }
+ *regs = kcb->jprobe_saved_regs;
+ memcpy((void *)stack_addr, kcb->jprobes_stack,
+ MIN_STACK_SIZE(stack_addr));
+ preempt_enable_no_resched();
+ return 1;
+}
+
+int __init arch_init_kprobes(void)
+{
+ return 0;
+}
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index e3928f5..5a163a6 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -105,6 +105,7 @@ SECTIONS
TEXT_TEXT
SCHED_TEXT
LOCK_TEXT
+ KPROBES_TEXT
HYPERVISOR_TEXT
IDMAP_TEXT
*(.fixup)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index abe2a95..bf9d451 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -41,6 +41,28 @@
static const char *fault_name(unsigned int esr);
+#ifdef CONFIG_KPROBES
+static inline int notify_page_fault(struct pt_regs *regs, unsigned int esr)
+{
+ int ret = 0;
+
+ /* kprobe_running() needs smp_processor_id() */
+ if (!user_mode(regs)) {
+ preempt_disable();
+ if (kprobe_running() && kprobe_fault_handler(regs, esr))
+ ret = 1;
+ preempt_enable();
+ }
+
+ return ret;
+}
+#else
+static inline int notify_page_fault(struct pt_regs *regs, unsigned int esr)
+{
+ return 0;
+}
+#endif
+
/*
* Dump out the page tables associated with 'addr' in mm 'mm'.
*/
@@ -201,6 +223,9 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC;
unsigned int mm_flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+ if (notify_page_fault(regs, esr))
+ return 0;
+
tsk = current;
mm = tsk->mm;
--
2.5.0
From: William Cohen <[email protected]>
The trampoline code is used by kretprobes to capture a return from a probed
function. This is done by saving the registers, calling the handler, and
restoring the registers. The code then returns to the original saved caller
return address. It is necessary to do this directly instead of using a
software breakpoint because the code used in processing that breakpoint
could itself be kprobe'd and cause a problematic reentry into the debug
exception handler.
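
The flow described above can be sketched in plain C. This is only a
userspace model under stated assumptions, not the assembly in this patch:
struct demo_regs, demo_handler_t, demo_fixed_handler() and DEMO_ORIG_RET are
hypothetical names, and the register context is reduced to two fields.

```c
#include <stdint.h>

/* Hypothetical original return address, for illustration only. */
#define DEMO_ORIG_RET 0x4000ULL

/* Hypothetical, heavily reduced register context (not the real pt_regs). */
struct demo_regs {
	uint64_t x0;
	uint64_t lr;
};

/* Stand-in for the C handler: it returns the real return address. */
typedef uint64_t (*demo_handler_t)(struct demo_regs *ctx);

/* Hypothetical handler that always reports the same original address. */
static uint64_t demo_fixed_handler(struct demo_regs *ctx)
{
	(void)ctx;
	return DEMO_ORIG_RET;
}

/*
 * Models the trampoline flow: save the context, call the handler,
 * restore the context, and continue at the address the handler returned.
 */
static uint64_t demo_trampoline(struct demo_regs *live, demo_handler_t handler)
{
	struct demo_regs saved = *live;		/* save registers */
	uint64_t ret_addr = handler(&saved);	/* call the handler */

	*live = saved;				/* restore registers */
	live->lr = ret_addr;			/* return to the original caller */
	return ret_addr;
}
```

Calling demo_trampoline() with demo_fixed_handler leaves x0 untouched and
sets lr to the address the handler reported, which is the property the
trampoline has to preserve.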
Signed-off-by: William Cohen <[email protected]>
Signed-off-by: David A. Long <[email protected]>
---
arch/arm64/include/asm/kprobes.h | 2 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/asm-offsets.c | 11 +++++
arch/arm64/kernel/kprobes.c | 5 ++
arch/arm64/kernel/kprobes_trampoline.S | 88 ++++++++++++++++++++++++++++++++++
5 files changed, 107 insertions(+)
create mode 100644 arch/arm64/kernel/kprobes_trampoline.S
diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
index 79c9511..61b4915 100644
--- a/arch/arm64/include/asm/kprobes.h
+++ b/arch/arm64/include/asm/kprobes.h
@@ -56,5 +56,7 @@ int kprobe_exceptions_notify(struct notifier_block *self,
unsigned long val, void *data);
int kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr);
int kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr);
+void kretprobe_trampoline(void);
+void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
#endif /* _ARM_KPROBES_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 08325e5..f192b7d 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -37,6 +37,7 @@ arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
arm64-obj-$(CONFIG_KGDB) += kgdb.o
arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o \
+ kprobes_trampoline.o \
probes-simulate-insn.o
arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o
arm64-obj-$(CONFIG_PCI) += pci.o
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index fffa4ac6..f7cc8ce 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -50,6 +50,17 @@ int main(void)
DEFINE(S_X5, offsetof(struct pt_regs, regs[5]));
DEFINE(S_X6, offsetof(struct pt_regs, regs[6]));
DEFINE(S_X7, offsetof(struct pt_regs, regs[7]));
+ DEFINE(S_X8, offsetof(struct pt_regs, regs[8]));
+ DEFINE(S_X10, offsetof(struct pt_regs, regs[10]));
+ DEFINE(S_X12, offsetof(struct pt_regs, regs[12]));
+ DEFINE(S_X14, offsetof(struct pt_regs, regs[14]));
+ DEFINE(S_X16, offsetof(struct pt_regs, regs[16]));
+ DEFINE(S_X18, offsetof(struct pt_regs, regs[18]));
+ DEFINE(S_X20, offsetof(struct pt_regs, regs[20]));
+ DEFINE(S_X22, offsetof(struct pt_regs, regs[22]));
+ DEFINE(S_X24, offsetof(struct pt_regs, regs[24]));
+ DEFINE(S_X26, offsetof(struct pt_regs, regs[26]));
+ DEFINE(S_X28, offsetof(struct pt_regs, regs[28]));
DEFINE(S_LR, offsetof(struct pt_regs, regs[30]));
DEFINE(S_SP, offsetof(struct pt_regs, sp));
#ifdef CONFIG_COMPAT
diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
index ffc5affd..bd3f233 100644
--- a/arch/arm64/kernel/kprobes.c
+++ b/arch/arm64/kernel/kprobes.c
@@ -532,6 +532,11 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
return 1;
}
+void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
+{
+ return NULL;
+}
+
int __init arch_init_kprobes(void)
{
return 0;
diff --git a/arch/arm64/kernel/kprobes_trampoline.S b/arch/arm64/kernel/kprobes_trampoline.S
new file mode 100644
index 0000000..072b4e5
--- /dev/null
+++ b/arch/arm64/kernel/kprobes_trampoline.S
@@ -0,0 +1,88 @@
+/*
+ * trampoline entry and return code for kretprobes.
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
+
+ .text
+
+.macro save_all_base_regs ctxt
+ stp x0, x1, [\ctxt, #S_X0]
+ stp x2, x3, [\ctxt, #S_X2]
+ stp x4, x5, [\ctxt, #S_X4]
+ stp x6, x7, [\ctxt, #S_X6]
+ stp x8, x9, [\ctxt, #S_X8]
+ stp x10, x11, [\ctxt, #S_X10]
+ stp x12, x13, [\ctxt, #S_X12]
+ stp x14, x15, [\ctxt, #S_X14]
+ stp x16, x17, [\ctxt, #S_X16]
+ stp x18, x19, [\ctxt, #S_X18]
+ stp x20, x21, [\ctxt, #S_X20]
+ stp x22, x23, [\ctxt, #S_X22]
+ stp x24, x25, [\ctxt, #S_X24]
+ stp x26, x27, [\ctxt, #S_X26]
+ stp x28, x29, [\ctxt, #S_X28]
+ str lr, [\ctxt, #S_LR]
+ add x0, \ctxt, #S_FRAME_SIZE
+ str x0, [\ctxt, #S_SP]
+/*
+ * Construct a useful saved PSTATE
+ */
+ mrs x0, nzcv
+ and x0, x0, #0xf0000000
+ mrs x1, daif
+ and x1, x1, #0x3c0
+ orr x0, x0, x1
+ mrs x1, CurrentEL
+ and x1, x1, #12
+ lsl x1, x1, #21
+ orr x0, x1, x0
+ mrs x1, SPSel
+ and x1, x1, #1
+ lsl x1, x1, #21
+ orr x0, x1, x0
+ str x0, [\ctxt, #S_PSTATE]
+.endm
+
+.macro restore_all_base_regs ctxt
+ ldr x0, [\ctxt, #S_PSTATE]
+ and x0, x0, #0xf0000000
+ msr nzcv, x0
+ ldp x0, x1, [\ctxt, #S_X0]
+ ldp x2, x3, [\ctxt, #S_X2]
+ ldp x4, x5, [\ctxt, #S_X4]
+ ldp x6, x7, [\ctxt, #S_X6]
+ ldp x8, x9, [\ctxt, #S_X8]
+ ldp x10, x11, [\ctxt, #S_X10]
+ ldp x12, x13, [\ctxt, #S_X12]
+ ldp x14, x15, [\ctxt, #S_X14]
+ ldp x16, x17, [\ctxt, #S_X16]
+ ldp x18, x19, [\ctxt, #S_X18]
+ ldp x20, x21, [\ctxt, #S_X20]
+ ldp x22, x23, [\ctxt, #S_X22]
+ ldp x24, x25, [\ctxt, #S_X24]
+ ldp x26, x27, [\ctxt, #S_X26]
+ ldp x28, x29, [\ctxt, #S_X28]
+.endm
+
+ENTRY(kretprobe_trampoline)
+
+ sub sp, sp, #S_FRAME_SIZE
+
+ save_all_base_regs sp
+
+ mov x0, sp
+ bl trampoline_probe_handler
+ /* Replace trampoline address in lr with actual
+ orig_ret_addr return address. */
+ mov lr, x0
+
+ restore_all_base_regs sp
+
+ add sp, sp, #S_FRAME_SIZE
+
+ ret
+
+ENDPROC(kretprobe_trampoline)
--
2.5.0
From: Sandeepa Prabhu <[email protected]>
Add info prints in sample kprobe handlers for ARM64
Signed-off-by: Sandeepa Prabhu <[email protected]>
---
samples/kprobes/kprobe_example.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/samples/kprobes/kprobe_example.c b/samples/kprobes/kprobe_example.c
index 727eb21..0c72b8a 100644
--- a/samples/kprobes/kprobe_example.c
+++ b/samples/kprobes/kprobe_example.c
@@ -42,6 +42,10 @@ static int handler_pre(struct kprobe *p, struct pt_regs *regs)
" ex1 = 0x%lx\n",
p->addr, regs->pc, regs->ex1);
#endif
+#ifdef CONFIG_ARM64
+ pr_info("pre_handler: p->addr = 0x%p, pc = 0x%lx\n",
+ p->addr, (long)regs->pc);
+#endif
/* A dump_stack() here will give a stack backtrace */
return 0;
@@ -67,6 +71,10 @@ static void handler_post(struct kprobe *p, struct pt_regs *regs,
printk(KERN_INFO "post_handler: p->addr = 0x%p, ex1 = 0x%lx\n",
p->addr, regs->ex1);
#endif
+#ifdef CONFIG_ARM64
+ pr_info("post_handler: p->addr = 0x%p, pc = 0x%lx\n",
+ p->addr, (long)regs->pc);
+#endif
}
/*
--
2.5.0
From: Sandeepa Prabhu <[email protected]>
The pre-handler of this special 'trampoline' kprobe executes the return
probe handler functions and restores original return address in ELR_EL1.
This way the saved pt_regs still hold the original register context to be
carried back to the probed kernel function.
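
As a rough illustration of how the handler finds the real return address
when several instances are pending, here is a userspace sketch. It assumes a
flat array ordered newest-first instead of the kernel hlist, and
struct demo_instance, DEMO_TRAMPOLINE and demo_find_return() are
hypothetical names introduced only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical trampoline address, for illustration only. */
#define DEMO_TRAMPOLINE 0xffff000000001000ULL

struct demo_instance {
	int task;		/* owning task id */
	uint64_t ret_addr;	/* saved return address, or DEMO_TRAMPOLINE */
};

/*
 * Walk the instances newest-first. Instances owned by other tasks are
 * skipped. Every matching instance is consumed until one holds a real
 * return address (anything other than the trampoline). Returns the last
 * ret_addr seen for the task, or 0 if the task has no instance.
 */
static uint64_t demo_find_return(const struct demo_instance *list, size_t n,
				 int task, size_t *consumed)
{
	uint64_t orig = 0;

	*consumed = 0;
	for (size_t i = 0; i < n; i++) {
		if (list[i].task != task)
			continue;	/* another task shares the bucket */

		orig = list[i].ret_addr;
		(*consumed)++;

		if (orig != DEMO_TRAMPOLINE)
			break;		/* real return address found */
	}
	return orig;
}
```

Instances left after the break belong to calls further up the stack and stay
pending until those functions return.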
Signed-off-by: Sandeepa Prabhu <[email protected]>
Signed-off-by: David A. Long <[email protected]>
---
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/kprobes.c | 75 ++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 75 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c395386..72412de 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -82,6 +82,7 @@ config ARM64
select HAVE_RCU_TABLE_FREE
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_KPROBES
+ select HAVE_KRETPROBES if HAVE_KPROBES
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
index bd3f233..13d3333 100644
--- a/arch/arm64/kernel/kprobes.c
+++ b/arch/arm64/kernel/kprobes.c
@@ -534,7 +534,80 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
{
- return NULL;
+ struct kretprobe_instance *ri = NULL;
+ struct hlist_head *head, empty_rp;
+ struct hlist_node *tmp;
+ unsigned long flags, orig_ret_addr = 0;
+ unsigned long trampoline_address =
+ (unsigned long)&kretprobe_trampoline;
+
+ INIT_HLIST_HEAD(&empty_rp);
+ kretprobe_hash_lock(current, &head, &flags);
+
+ /*
+ * It is possible to have multiple instances associated with a given
+ * task either because multiple functions in the call path have
+ * a return probe installed on them, and/or more than one return
+ * probe was registered for a target function.
+ *
+ * We can handle this because:
+ * - instances are always inserted at the head of the list
+ * - when multiple return probes are registered for the same
+ * function, the first instance's ret_addr will point to the
+ * real return address, and all the rest will point to
+ * kretprobe_trampoline
+ */
+ hlist_for_each_entry_safe(ri, tmp, head, hlist) {
+ if (ri->task != current)
+ /* another task is sharing our hash bucket */
+ continue;
+
+ if (ri->rp && ri->rp->handler) {
+ __this_cpu_write(current_kprobe, &ri->rp->kp);
+ get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE;
+ ri->rp->handler(ri, regs);
+ __this_cpu_write(current_kprobe, NULL);
+ }
+
+ orig_ret_addr = (unsigned long)ri->ret_addr;
+ recycle_rp_inst(ri, &empty_rp);
+
+ if (orig_ret_addr != trampoline_address)
+ /*
+ * This is the real return address. Any other
+ * instances associated with this task are for
+ * other calls deeper on the call stack
+ */
+ break;
+ }
+
+ kretprobe_assert(ri, orig_ret_addr, trampoline_address);
+ /* restore the original return address */
+ instruction_pointer(regs) = orig_ret_addr;
+ reset_current_kprobe();
+ kretprobe_hash_unlock(current, &flags);
+
+ hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
+ hlist_del(&ri->hlist);
+ kfree(ri);
+ }
+
+ /* hand the original return address back to the trampoline */
+ return (void *) orig_ret_addr;
+}
+
+void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
+ struct pt_regs *regs)
+{
+ ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
+
+ /* replace return addr (x30) with trampoline */
+ regs->regs[30] = (long)&kretprobe_trampoline;
+}
+
+int __kprobes arch_trampoline_kprobe(struct kprobe *p)
+{
+ return 0;
}
int __init arch_init_kprobes(void)
--
2.5.0
Hi David,
On 09/03/16 05:32, David Long wrote:
> From: "David A. Long" <[email protected]>
>
> Add HAVE_REGS_AND_STACK_ACCESS_API feature for arm64.
>
> Signed-off-by: David A. Long <[email protected]>
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index ff7f132..efebf0f 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
[ ... SNIP ... ]
> +/**
> + * regs_within_kernel_stack() - check the address in the stack
> + * @regs: pt_regs which contains kernel stack pointer.
> + * @addr: address which is checked.
> + *
> + * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
> + * If @addr is within the kernel stack, it returns true. If not, returns false.
> + */
> +bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
> +{
> + return ((addr & ~(THREAD_SIZE - 1)) ==
> + (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
I'm not sure where this is called from, but if kernel_stack_pointer(regs) could
ever point into an irq_stack you will get the wrong result.
arch/arm64/include/asm/irq.h has 'on_irq_stack(sp, cpu)' which should help,
although you will need to check the bounds of the irq_stack separately.
The horrible details...
From arch/arm64/kernel/irq.c:20
> /* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
> DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack)
> __aligned(16);
This was because per-cpu variables can be at most page aligned.
6cdf9c7ca687 ("arm64: Store struct thread_info in sp_el0") changed
current_thread_info() to work on these weirdly aligned irq_stacks.
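
To make the failure mode concrete, here is a small userspace sketch. It is
not kernel code: DEMO_STACK_SIZE and both helper names are assumptions picked
for illustration. The mask comparison is only valid when the stack base is
aligned to the stack size; a stack that is merely 16-byte aligned needs an
explicit bounds check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative size only; the real THREAD_SIZE depends on the config. */
#define DEMO_STACK_SIZE 0x4000UL

/* Mask-based check, same shape as the quoted regs_within_kernel_stack(). */
static bool same_stack_by_mask(uintptr_t sp, uintptr_t addr)
{
	return (addr & ~(DEMO_STACK_SIZE - 1)) ==
	       (sp & ~(DEMO_STACK_SIZE - 1));
}

/* Explicit bounds check for a stack whose base is not size-aligned. */
static bool within_stack_bounds(uintptr_t base, uintptr_t addr)
{
	return addr >= base && addr < base + DEMO_STACK_SIZE;
}
```

With a stack based at 0x12010, sp = 0x13000 and addr = 0x15000 are both on
the stack, yet the mask check says they are not. Going the other way,
addr = 0x11000 is below the base but still passes the mask check.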
Thanks,
James
On Wed, 9 Mar 2016 00:32:20 -0500
David Long <[email protected]> wrote:
David,
> From: Sandeepa Prabhu <[email protected]>
>
> Kprobes needs simulation of instructions that cannot be stepped
> from a different memory location, e.g. instructions
> that use PC-relative addressing. In simulation, the behaviour
> of the instruction is implemented using a copy of pt_regs.
>
> The following instruction categories are simulated:
> - All branching instructions (conditional, register, and immediate)
> - Literal access instructions (load-literal, adr/adrp)
>
> Conditional execution is limited to branching instructions in
> ARM v8. If conditions at PSTATE do not match the condition fields
> of opcode, the instruction is effectively NOP.
>
> Thanks to Will Cohen for assorted suggested changes.
>
> Signed-off-by: Sandeepa Prabhu <[email protected]>
> Signed-off-by: William Cohen <[email protected]>
> Signed-off-by: David A. Long <[email protected]>
> ---
> arch/arm64/include/asm/insn.h | 1 +
> arch/arm64/include/asm/probes.h | 5 +-
> arch/arm64/kernel/Makefile | 3 +-
> arch/arm64/kernel/insn.c | 1 +
> arch/arm64/kernel/kprobes-arm64.c | 29 ++++
> arch/arm64/kernel/kprobes.c | 32 ++++-
> arch/arm64/kernel/probes-simulate-insn.c | 218 +++++++++++++++++++++++++++++++
> arch/arm64/kernel/probes-simulate-insn.h | 28 ++++
> 8 files changed, 311 insertions(+), 6 deletions(-)
> create mode 100644 arch/arm64/kernel/probes-simulate-insn.c
> create mode 100644 arch/arm64/kernel/probes-simulate-insn.h
>
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index b9567a1..26cee10 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -410,6 +410,7 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn);
>
> typedef bool (pstate_check_t)(unsigned long);
> extern pstate_check_t * const opcode_condition_checks[16];
> +
> #endif /* __ASSEMBLY__ */
>
> #endif /* __ASM_INSN_H */
> diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h
> index c5fcbe6..d524f7d 100644
> --- a/arch/arm64/include/asm/probes.h
> +++ b/arch/arm64/include/asm/probes.h
> @@ -15,11 +15,12 @@
> #ifndef _ARM_PROBES_H
> #define _ARM_PROBES_H
>
> +#include <asm/opcodes.h>
> +
> struct kprobe;
> struct arch_specific_insn;
>
> typedef u32 kprobe_opcode_t;
> -typedef unsigned long (kprobes_pstate_check_t)(unsigned long);
> typedef void (kprobes_handler_t) (u32 opcode, long addr, struct pt_regs *);
>
> enum pc_restore_type {
> @@ -35,7 +36,7 @@ struct kprobe_pc_restore {
> /* architecture specific copy of original instruction */
> struct arch_specific_insn {
> kprobe_opcode_t *insn;
> - kprobes_pstate_check_t *pstate_cc;
> + pstate_check_t *pstate_cc;
> kprobes_handler_t *handler;
> /* restore address after step xol */
> struct kprobe_pc_restore restore;
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 4efb791..08325e5 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -36,7 +36,8 @@ arm64-obj-$(CONFIG_CPU_PM) += sleep.o suspend.o
> arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
> arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
> arm64-obj-$(CONFIG_KGDB) += kgdb.o
> -arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o
> +arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o \
> + probes-simulate-insn.o
> arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o
> arm64-obj-$(CONFIG_PCI) += pci.o
> arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 9f15ceb..f9a3432 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -30,6 +30,7 @@
> #include <asm/cacheflush.h>
> #include <asm/debug-monitors.h>
> #include <asm/fixmap.h>
> +#include <asm/opcodes.h>
> #include <asm/insn.h>
>
> #define AARCH64_INSN_SF_BIT BIT(31)
> diff --git a/arch/arm64/kernel/kprobes-arm64.c b/arch/arm64/kernel/kprobes-arm64.c
> index e07727a..487238a 100644
> --- a/arch/arm64/kernel/kprobes-arm64.c
> +++ b/arch/arm64/kernel/kprobes-arm64.c
> @@ -21,6 +21,7 @@
> #include <asm/sections.h>
>
> #include "kprobes-arm64.h"
> +#include "probes-simulate-insn.h"
>
> static bool __kprobes aarch64_insn_is_steppable(u32 insn)
> {
> @@ -62,8 +63,36 @@ arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
> */
> if (aarch64_insn_is_steppable(insn))
> return INSN_GOOD;
> +
> + if (aarch64_insn_is_bcond(insn)) {
> + asi->handler = simulate_b_cond;
> + } else if (aarch64_insn_is_cbz(insn) ||
> + aarch64_insn_is_cbnz(insn)) {
> + asi->handler = simulate_cbz_cbnz;
> + } else if (aarch64_insn_is_tbz(insn) ||
> + aarch64_insn_is_tbnz(insn)) {
> + asi->handler = simulate_tbz_tbnz;
> + } else if (aarch64_insn_is_adr_adrp(insn))
> + asi->handler = simulate_adr_adrp;
> + else if (aarch64_insn_is_b(insn) ||
> + aarch64_insn_is_bl(insn))
> + asi->handler = simulate_b_bl;
> + else if (aarch64_insn_is_br(insn) ||
> + aarch64_insn_is_blr(insn) ||
> + aarch64_insn_is_ret(insn))
> + asi->handler = simulate_br_blr_ret;
> + else if (aarch64_insn_is_ldr_lit(insn))
> + asi->handler = simulate_ldr_literal;
> + else if (aarch64_insn_is_ldrsw_lit(insn))
> + asi->handler = simulate_ldrsw_literal;
> else
> + /*
> + * Instruction cannot be stepped out-of-line and we don't
> + * (yet) simulate it.
> + */
> return INSN_REJECTED;
> +
> + return INSN_GOOD_NO_SLOT;
> }
>
> static bool __kprobes
> diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
> index e72dbce..ffc5affd 100644
> --- a/arch/arm64/kernel/kprobes.c
> +++ b/arch/arm64/kernel/kprobes.c
> @@ -40,6 +40,9 @@ void jprobe_return_break(void);
> DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
> DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>
> +static void __kprobes
> +post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
> +
> static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> {
> /* prepare insn slot */
> @@ -57,6 +60,24 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
> p->ainsn.restore.type = RESTORE_PC;
> }
>
> +static void __kprobes arch_prepare_simulate(struct kprobe *p)
> +{
> + /* This instruction is not executed xol. No need to adjust the PC */
> + p->ainsn.restore.addr = 0;
> + p->ainsn.restore.type = NO_RESTORE;
> +}
> +
> +static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
> +{
> + struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
> +
> + if (p->ainsn.handler)
> + p->ainsn.handler((u32)p->opcode, (long)p->addr, regs);
> +
> + /* single step simulated, now go for post processing */
> + post_kprobe_handler(kcb, regs);
> +}
> +
> int __kprobes arch_prepare_kprobe(struct kprobe *p)
> {
> unsigned long probe_addr = (unsigned long)p->addr;
> @@ -73,7 +94,8 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
> return -EINVAL;
>
> case INSN_GOOD_NO_SLOT: /* insn need simulation */
> - return -EINVAL;
> + p->ainsn.insn = NULL;
> + break;
>
> case INSN_GOOD: /* instruction uses slot */
> p->ainsn.insn = get_insn_slot();
> @@ -83,7 +105,10 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
> };
>
> /* prepare the instruction */
> - arch_prepare_ss_slot(p);
> + if (p->ainsn.insn)
> + arch_prepare_ss_slot(p);
> + else
> + arch_prepare_simulate(p);
>
> return 0;
> }
> @@ -225,7 +250,8 @@ static void __kprobes setup_singlestep(struct kprobe *p,
> kernel_enable_single_step(regs);
> instruction_pointer(regs) = slot;
> } else {
> - BUG();
> + /* insn simulation */
> + arch_simulate_insn(p, regs);
> }
> }
>
> diff --git a/arch/arm64/kernel/probes-simulate-insn.c b/arch/arm64/kernel/probes-simulate-insn.c
> new file mode 100644
> index 0000000..94333a6
> --- /dev/null
> +++ b/arch/arm64/kernel/probes-simulate-insn.c
> @@ -0,0 +1,218 @@
> +/*
> + * arch/arm64/kernel/probes-simulate-insn.c
> + *
> + * Copyright (C) 2013 Linaro Limited.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * General Public License for more details.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/kprobes.h>
> +#include <linux/module.h>
> +
> +#include "probes-simulate-insn.h"
> +
> +#define sign_extend(x, signbit) \
> + ((x) | (0 - ((x) & (1 << (signbit)))))
> +
> +#define bbl_displacement(insn) \
> + sign_extend(((insn) & 0x3ffffff) << 2, 27)
> +
> +#define bcond_displacement(insn) \
> + sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
> +
> +#define cbz_displacement(insn) \
> + sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
> +
> +#define tbz_displacement(insn) \
> + sign_extend(((insn >> 5) & 0x3fff) << 2, 15)
> +
> +#define ldr_displacement(insn) \
> + sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
> +
> +static inline void set_x_reg(struct pt_regs *regs, int reg, u64 val)
> +{
> + if (reg < 31)
> + regs->regs[reg] = val;
> +}
> +
> +static inline void set_w_reg(struct pt_regs *regs, int reg, u64 val)
> +{
> + if (reg < 31)
> + *(u32 *) (&regs->regs[reg]) = val;
I'm afraid this is subtly buggy. A "ldr w0, =value" will write the
entire register, clearing the top 32 bits. Here, you're only writing
the bottom 32 bits (not to mention that this looks completely broken on
BE).
A much better way of writing this would be:
regs->regs[reg] = lower_32_bits(val);
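
A minimal userspace sketch of the zero-extending behaviour, assuming a
simplified stand-in for pt_regs (struct demo_regs and the demo_* helpers are
hypothetical names, not part of the patch). Assigning a 32-bit value to the
64-bit slot clears the upper half and does not depend on endianness:

```c
#include <stdint.h>

/* Hypothetical stand-in for the general-purpose registers in pt_regs. */
struct demo_regs {
	uint64_t regs[31];
};

/* Write a W register: the value is zero-extended into the X register. */
static void demo_set_w_reg(struct demo_regs *regs, int reg, uint32_t val)
{
	if (reg >= 0 && reg < 31)
		regs->regs[reg] = val;	/* implicit zero-extension to 64 bits */
}

/* Read a W register: the low 32 bits of the X register, 0 for xzr. */
static uint32_t demo_get_w_reg(const struct demo_regs *regs, int reg)
{
	if (reg >= 0 && reg < 31)
		return (uint32_t)regs->regs[reg];
	return 0;
}
```

Taking a uint32_t parameter also makes the prototype match what the caller
actually passes.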
> +}
> +
> +static inline u64 get_x_reg(struct pt_regs *regs, int reg)
> +{
> + if (reg < 31)
> + return regs->regs[reg];
> + else
> + return 0;
> +}
> +
> +static inline u32 get_w_reg(struct pt_regs *regs, int reg)
> +{
> + if (reg < 31)
> + return regs->regs[reg] & 0xffffffff;
return lower_32_bits(regs->regs[reg]);
> + else
> + return 0;
> +}
> +
> +static bool __kprobes check_cbz(u32 opcode, struct pt_regs *regs)
> +{
> + int xn = opcode & 0x1f;
> +
> + return (opcode & (1 << 31)) ?
> + (get_x_reg(regs, xn) == 0) : (get_w_reg(regs, xn) == 0);
> +}
> +
> +static bool __kprobes check_cbnz(u32 opcode, struct pt_regs *regs)
> +{
> + int xn = opcode & 0x1f;
> +
> + return (opcode & (1 << 31)) ?
> + (get_x_reg(regs, xn) != 0) : (get_w_reg(regs, xn) != 0);
> +}
> +
> +static bool __kprobes check_tbz(u32 opcode, struct pt_regs *regs)
> +{
> + int xn = opcode & 0x1f;
> + int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);
> +
> + return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) == 0;
> +}
> +
> +static bool __kprobes check_tbnz(u32 opcode, struct pt_regs *regs)
> +{
> + int xn = opcode & 0x1f;
> + int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);
> +
> + return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) != 0;
> +}
> +
> +/*
> + * instruction simulation functions
> + */
> +void __kprobes
> +simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + long imm, xn, val;
> +
> + xn = opcode & 0x1f;
> + imm = ((opcode >> 3) & 0x1ffffc) | ((opcode >> 29) & 0x3);
> + imm = sign_extend(imm, 20);
> + if (opcode & 0x80000000)
> + val = (imm<<12) + (addr & 0xfffffffffffff000);
> + else
> + val = imm + addr;
> +
> + set_x_reg(regs, xn, val);
> +
> + instruction_pointer(regs) += 4;
> +}
> +
> +void __kprobes
> +simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + int disp = bbl_displacement(opcode);
> +
> + /* Link register is x30 */
> + if (opcode & (1 << 31))
> + set_x_reg(regs, 30, addr + 4);
> +
> + instruction_pointer(regs) = addr + disp;
> +}
> +
> +void __kprobes
> +simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + int disp = 4;
> +
> + if (opcode_condition_checks[opcode & 0xf](regs->pstate & 0xffffffff))
> + disp = bcond_displacement(opcode);
> +
> + instruction_pointer(regs) = addr + disp;
> +}
> +
> +void __kprobes
> +simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + int xn = (opcode >> 5) & 0x1f;
> +
> + /* update pc first in case we're doing a "blr lr" */
> + instruction_pointer(regs) = get_x_reg(regs, xn);
> +
> + /* Link register is x30 */
> + if (((opcode >> 21) & 0x3) == 1)
> + set_x_reg(regs, 30, addr + 4);
> +}
> +
> +void __kprobes
> +simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + int disp = 4;
> +
> + if (opcode & (1 << 24)) {
> + if (check_cbnz(opcode, regs))
> + disp = cbz_displacement(opcode);
> + } else {
> + if (check_cbz(opcode, regs))
> + disp = cbz_displacement(opcode);
> + }
> + instruction_pointer(regs) = addr + disp;
> +}
> +
> +void __kprobes
> +simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + int disp = 4;
> +
> + if (opcode & (1 << 24)) {
> + if (check_tbnz(opcode, regs))
> + disp = tbz_displacement(opcode);
> + } else {
> + if (check_tbz(opcode, regs))
> + disp = tbz_displacement(opcode);
> + }
> + instruction_pointer(regs) = addr + disp;
> +}
> +
> +void __kprobes
> +simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + u64 *load_addr;
> + int xn = opcode & 0x1f;
> + int disp;
> +
> + disp = ldr_displacement(opcode);
> + load_addr = (u64 *) (addr + disp);
> +
> + if (opcode & (1 << 30)) /* x0-x30 */
> + set_x_reg(regs, xn, *load_addr);
> + else /* w0-w30 */
> + set_w_reg(regs, xn, (*(u32 *) (load_addr)));
If you're passing a u32 to set_w_reg(), why is the prototype taking a
u64?
> +
> + instruction_pointer(regs) += 4;
> +}
> +
> +void __kprobes
> +simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs)
> +{
> + s32 *load_addr;
> + int xn = opcode & 0x1f;
> + int disp;
> +
> + disp = ldr_displacement(opcode);
> + load_addr = (s32 *) (addr + disp);
> +
> + set_x_reg(regs, xn, *load_addr);
> +
> + instruction_pointer(regs) += 4;
> +}
> diff --git a/arch/arm64/kernel/probes-simulate-insn.h b/arch/arm64/kernel/probes-simulate-insn.h
> new file mode 100644
> index 0000000..d6bb9a5
> --- /dev/null
> +++ b/arch/arm64/kernel/probes-simulate-insn.h
> @@ -0,0 +1,28 @@
> +/*
> + * arch/arm64/kernel/probes-simulate-insn.h
> + *
> + * Copyright (C) 2013 Linaro Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + * General Public License for more details.
> + */
> +
> +#ifndef _ARM_KERNEL_PROBES_SIMULATE_INSN_H
> +#define _ARM_KERNEL_PROBES_SIMULATE_INSN_H
> +
> +void simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs);
> +void simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs);
> +void simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs);
> +void simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs);
> +void simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs);
> +void simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs);
> +void simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs);
> +void simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs);
> +
> +#endif /* _ARM_KERNEL_PROBES_SIMULATE_INSN_H */
Thanks,
M.
--
Jazz is not dead. It just smells funny.
On Wed, 9 Mar 2016 00:32:18 -0500
David Long <[email protected]> wrote:
> From: "David A. Long" <[email protected]>
>
> Cease using the arm32 arm_check_condition() function and replace it with
> a local version for use in deprecated instruction support on arm64. Also
> make the function table used by this available for future use by kprobes
> and/or uprobes.
>
> This function is derived from code written by Sandeepa Prabhu.
>
> Signed-off-by: Sandeepa Prabhu <[email protected]>
> Signed-off-by: David A. Long <[email protected]>
> ---
> arch/arm64/include/asm/insn.h | 3 ++
> arch/arm64/kernel/Makefile | 3 +-
> arch/arm64/kernel/armv8_deprecated.c | 19 +++++++-
> arch/arm64/kernel/insn.c | 94 ++++++++++++++++++++++++++++++++++++
> 4 files changed, 115 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
> index 662b42a..72dda48 100644
> --- a/arch/arm64/include/asm/insn.h
> +++ b/arch/arm64/include/asm/insn.h
> @@ -405,6 +405,9 @@ u32 aarch64_extract_system_register(u32 insn);
> u32 aarch32_insn_extract_reg_num(u32 insn, int offset);
> u32 aarch32_insn_mcr_extract_opc2(u32 insn);
> u32 aarch32_insn_mcr_extract_crm(u32 insn);
> +
> +typedef bool (pstate_check_t)(unsigned long);
> +extern pstate_check_t * const opcode_condition_checks[16];
> #endif /* __ASSEMBLY__ */
>
> #endif /* __ASM_INSN_H */
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 83cd7e6..fd5f163 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -26,8 +26,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
> $(call if_changed,objcopy)
>
> arm64-obj-$(CONFIG_COMPAT) += sys32.o kuser32.o signal32.o \
> - sys_compat.o entry32.o \
> - ../../arm/kernel/opcodes.o
> + sys_compat.o entry32.o
> arm64-obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
> arm64-obj-$(CONFIG_MODULES) += arm64ksyms.o module.o
> arm64-obj-$(CONFIG_PERF_EVENTS) += perf_regs.o perf_callchain.o
> diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
> index 3e01207..c655259 100644
> --- a/arch/arm64/kernel/armv8_deprecated.c
> +++ b/arch/arm64/kernel/armv8_deprecated.c
> @@ -369,6 +369,21 @@ static int emulate_swpX(unsigned int address, unsigned int *data,
> return res;
> }
>
> +#define ARM_OPCODE_CONDITION_UNCOND 0xf
> +
> +static unsigned int __kprobes arm32_check_condition(u32 opcode, u32 psr)
> +{
> + u32 cc_bits = opcode >> 28;
> +
> + if (cc_bits != ARM_OPCODE_CONDITION_UNCOND) {
> + if ((*opcode_condition_checks[cc_bits])(psr))
> + return ARM_OPCODE_CONDTEST_PASS;
> + else
> + return ARM_OPCODE_CONDTEST_FAIL;
> + }
> + return ARM_OPCODE_CONDTEST_UNCOND;
> +}
> +
> /*
> * swp_handler logs the id of calling process, dissects the instruction, sanity
> * checks the memory location, calls emulate_swpX for the actual operation and
> @@ -383,7 +398,7 @@ static int swp_handler(struct pt_regs *regs, u32 instr)
>
> type = instr & TYPE_SWPB;
>
> - switch (arm_check_condition(instr, regs->pstate)) {
> + switch (arm32_check_condition(instr, regs->pstate)) {
> case ARM_OPCODE_CONDTEST_PASS:
> break;
> case ARM_OPCODE_CONDTEST_FAIL:
> @@ -464,7 +479,7 @@ static int cp15barrier_handler(struct pt_regs *regs, u32 instr)
> {
> perf_sw_event(PERF_COUNT_SW_EMULATION_FAULTS, 1, regs, regs->pc);
>
> - switch (arm_check_condition(instr, regs->pstate)) {
> + switch (arm32_check_condition(instr, regs->pstate)) {
> case ARM_OPCODE_CONDTEST_PASS:
> break;
> case ARM_OPCODE_CONDTEST_FAIL:
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 60c1c71..9f15ceb 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -1234,3 +1234,97 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn)
> {
> return insn & CRM_MASK;
> }
> +
> +static bool __kprobes __check_eq(unsigned long pstate)
> +{
> + return (pstate & PSR_Z_BIT) != 0;
> +}
> +
> +static bool __kprobes __check_ne(unsigned long pstate)
> +{
> + return (pstate & PSR_Z_BIT) == 0;
> +}
> +
> +static bool __kprobes __check_cs(unsigned long pstate)
> +{
> + return (pstate & PSR_C_BIT) != 0;
> +}
> +
> +static bool __kprobes __check_cc(unsigned long pstate)
> +{
> + return (pstate & PSR_C_BIT) == 0;
> +}
> +
> +static bool __kprobes __check_mi(unsigned long pstate)
> +{
> + return (pstate & PSR_N_BIT) != 0;
> +}
> +
> +static bool __kprobes __check_pl(unsigned long pstate)
> +{
> + return (pstate & PSR_N_BIT) == 0;
> +}
> +
> +static bool __kprobes __check_vs(unsigned long pstate)
> +{
> + return (pstate & PSR_V_BIT) != 0;
> +}
> +
> +static bool __kprobes __check_vc(unsigned long pstate)
> +{
> + return (pstate & PSR_V_BIT) == 0;
> +}
> +
> +static bool __kprobes __check_hi(unsigned long pstate)
> +{
> + pstate &= ~(pstate >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */
> + return (pstate & PSR_C_BIT) != 0;
> +}
> +
> +static bool __kprobes __check_ls(unsigned long pstate)
> +{
> + pstate &= ~(pstate >> 1); /* PSR_C_BIT &= ~PSR_Z_BIT */
> + return (pstate & PSR_C_BIT) == 0;
> +}
> +
> +static bool __kprobes __check_ge(unsigned long pstate)
> +{
> + pstate ^= (pstate << 3); /* PSR_N_BIT ^= PSR_V_BIT */
> + return (pstate & PSR_N_BIT) == 0;
> +}
> +
> +static bool __kprobes __check_lt(unsigned long pstate)
> +{
> + pstate ^= (pstate << 3); /* PSR_N_BIT ^= PSR_V_BIT */
> + return (pstate & PSR_N_BIT) != 0;
> +}
> +
> +static bool __kprobes __check_gt(unsigned long pstate)
> +{
> + /*PSR_N_BIT ^= PSR_V_BIT */
> + unsigned long temp = pstate ^ (pstate << 3);
> +
> + temp |= (pstate << 1); /*PSR_N_BIT |= PSR_Z_BIT */
> + return (temp & PSR_N_BIT) == 0;
> +}
> +
> +static bool __kprobes __check_le(unsigned long pstate)
> +{
> + /*PSR_N_BIT ^= PSR_V_BIT */
> + unsigned long temp = pstate ^ (pstate << 3);
> +
> + temp |= (pstate << 1); /*PSR_N_BIT |= PSR_Z_BIT */
> + return (temp & PSR_N_BIT) != 0;
> +}
> +
> +static bool __kprobes __check_al(unsigned long pstate)
> +{
> + return true;
> +}
> +
> +pstate_check_t * const opcode_condition_checks[16] = {
> + __check_eq, __check_ne, __check_cs, __check_cc,
> + __check_mi, __check_pl, __check_vs, __check_vc,
> + __check_hi, __check_ls, __check_ge, __check_lt,
> + __check_gt, __check_le, __check_al, __check_al
The very last entry seems wrong, or is at least the opposite of what
the current code has. It should be something called __check_nv(), and
always return false (condition code NEVER).
> +};
Thanks,
M.
--
Jazz is not dead. It just smells funny.
On Wed, 9 Mar 2016 00:32:21 -0500
David Long <[email protected]> wrote:
David,
I remember looking at that code over your shoulder whilst at Connect
last week, but I clearly wasn't running on all cylinders, because there
are a few gotchas here - see below.
> From: William Cohen <[email protected]>
>
> The trampoline code is used by kretprobes to capture a return from a probed
> function. This is done by saving the registers, calling the handler, and
> restoring the registers. The code then returns to the original saved caller
> return address. It is necessary to do this directly instead of using a
> software breakpoint because the code used in processing that breakpoint
> could itself be kprobe'd and cause a problematic reentry into the debug
> exception handler.
>
> Signed-off-by: William Cohen <[email protected]>
> Signed-off-by: David A. Long <[email protected]>
> ---
> arch/arm64/include/asm/kprobes.h | 2 +
> arch/arm64/kernel/Makefile | 1 +
> arch/arm64/kernel/asm-offsets.c | 11 +++++
> arch/arm64/kernel/kprobes.c | 5 ++
> arch/arm64/kernel/kprobes_trampoline.S | 88 ++++++++++++++++++++++++++++++++++
> 5 files changed, 107 insertions(+)
> create mode 100644 arch/arm64/kernel/kprobes_trampoline.S
>
> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> index 79c9511..61b4915 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -56,5 +56,7 @@ int kprobe_exceptions_notify(struct notifier_block *self,
> unsigned long val, void *data);
> int kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr);
> int kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr);
> +void kretprobe_trampoline(void);
> +void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
>
> #endif /* _ARM_KPROBES_H */
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 08325e5..f192b7d 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -37,6 +37,7 @@ arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
> arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
> arm64-obj-$(CONFIG_KGDB) += kgdb.o
> arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o \
> + kprobes_trampoline.o \
> probes-simulate-insn.o
> arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o
> arm64-obj-$(CONFIG_PCI) += pci.o
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index fffa4ac6..f7cc8ce 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -50,6 +50,17 @@ int main(void)
> DEFINE(S_X5, offsetof(struct pt_regs, regs[5]));
> DEFINE(S_X6, offsetof(struct pt_regs, regs[6]));
> DEFINE(S_X7, offsetof(struct pt_regs, regs[7]));
> + DEFINE(S_X8, offsetof(struct pt_regs, regs[8]));
> + DEFINE(S_X10, offsetof(struct pt_regs, regs[10]));
> + DEFINE(S_X12, offsetof(struct pt_regs, regs[12]));
> + DEFINE(S_X14, offsetof(struct pt_regs, regs[14]));
> + DEFINE(S_X16, offsetof(struct pt_regs, regs[16]));
> + DEFINE(S_X18, offsetof(struct pt_regs, regs[18]));
> + DEFINE(S_X20, offsetof(struct pt_regs, regs[20]));
> + DEFINE(S_X22, offsetof(struct pt_regs, regs[22]));
> + DEFINE(S_X24, offsetof(struct pt_regs, regs[24]));
> + DEFINE(S_X26, offsetof(struct pt_regs, regs[26]));
> + DEFINE(S_X28, offsetof(struct pt_regs, regs[28]));
> DEFINE(S_LR, offsetof(struct pt_regs, regs[30]));
> DEFINE(S_SP, offsetof(struct pt_regs, sp));
> #ifdef CONFIG_COMPAT
> diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
> index ffc5affd..bd3f233 100644
> --- a/arch/arm64/kernel/kprobes.c
> +++ b/arch/arm64/kernel/kprobes.c
> @@ -532,6 +532,11 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
> return 1;
> }
>
> +void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
> +{
> + return NULL;
> +}
> +
> int __init arch_init_kprobes(void)
> {
> return 0;
> diff --git a/arch/arm64/kernel/kprobes_trampoline.S b/arch/arm64/kernel/kprobes_trampoline.S
> new file mode 100644
> index 0000000..072b4e5
> --- /dev/null
> +++ b/arch/arm64/kernel/kprobes_trampoline.S
> @@ -0,0 +1,88 @@
> +/*
> + * trampoline entry and return code for kretprobes.
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/assembler.h>
> +
> + .text
> +
> +.macro save_all_base_regs ctxt
> + stp x0, x1, [\ctxt, #S_X0]
> + stp x2, x3, [\ctxt, #S_X2]
> + stp x4, x5, [\ctxt, #S_X4]
> + stp x6, x7, [\ctxt, #S_X6]
> + stp x8, x9, [\ctxt, #S_X8]
> + stp x10, x11, [\ctxt, #S_X10]
> + stp x12, x13, [\ctxt, #S_X12]
> + stp x14, x15, [\ctxt, #S_X14]
> + stp x16, x17, [\ctxt, #S_X16]
> + stp x18, x19, [\ctxt, #S_X18]
> + stp x20, x21, [\ctxt, #S_X20]
> + stp x22, x23, [\ctxt, #S_X22]
> + stp x24, x25, [\ctxt, #S_X24]
> + stp x26, x27, [\ctxt, #S_X26]
> + stp x28, x29, [\ctxt, #S_X28]
> + str lr, [\ctxt, #S_LR]
> + add x0, \ctxt, #S_FRAME_SIZE
> + str x0, [\ctxt, #S_SP]
Nit: this could also be rewritten as:
	add x0, \ctxt, #S_FRAME_SIZE
	stp lr, x0, [\ctxt, #S_LR]
Another thing worth noting is that since your macro saves all the
GP registers, only SP can be used for the ctxt parameter. This means
you're better off hardcoding SP in this macro, and not giving the
illusion of being generic.
> +/*
> + * Construct a useful saved PSTATE
> + */
> + mrs x0, nzcv
> + and x0, x0, #0xf0000000
It'd be worth spelling this as (PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT |
PSR_V_BIT)...
> + mrs x1, daif
> + and x1, x1, #0x3c0
... and this as (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT).
> + orr x0, x0, x1
> + mrs x1, CurrentEL
> + and x1, x1, #12
I'd like this '12' to be a bit more explicit. How about (3 << 2)? I
tend to parse this kind of thing much more easily...
> + lsl x1, x1, #21
What's that shift for? The various bits should be at the right place
already.
> + orr x0, x1, x0
> + mrs x1, SPSel
> + and x1, x1, #1
> + lsl x1, x1, #21
Same here (you're in single-step territory, which is probably not what
you want...).
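To make the point about the shifts concrete: CurrentEL already reports the exception level in bits [3:2] and SPSel reports the stack selection in bit 0, which are the positions of the mode field in a saved PSTATE. A rough C model of the value the macro is trying to build, assuming the PSR_* values from the uapi ptrace.h (build_saved_pstate() and the two *_MASK names are made up for this sketch):

```c
#include <stdint.h>

/* Flag and mask bits as laid out in a saved PSTATE (SPSR format). */
#define PSR_F_BIT	0x00000040UL
#define PSR_I_BIT	0x00000080UL
#define PSR_A_BIT	0x00000100UL
#define PSR_D_BIT	0x00000200UL
#define PSR_V_BIT	0x10000000UL
#define PSR_C_BIT	0x20000000UL
#define PSR_Z_BIT	0x40000000UL
#define PSR_N_BIT	0x80000000UL

/* Hypothetical names: the useful bits of CurrentEL and SPSel. */
#define CURRENTEL_EL_MASK	(3UL << 2)
#define SPSEL_SP_MASK		(1UL << 0)

/*
 * Combine the four system register reads into one saved-PSTATE value.
 * No shifting is needed: every field is already at its final position.
 */
static uint64_t build_saved_pstate(uint64_t nzcv, uint64_t daif,
				   uint64_t currentel, uint64_t spsel)
{
	uint64_t pstate;

	pstate  = nzcv & (PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT);
	pstate |= daif & (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
	pstate |= currentel & CURRENTEL_EL_MASK;
	pstate |= spsel & SPSEL_SP_MASK;

	return pstate;
}
```

At EL1 with SP_EL1 selected this gives a mode field of 0x5 (EL1h). Shifting the SPSel bit left by 21 would instead set bit 21, which is the SS bit.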
> + orr x0, x1, x0
> + str x0, [\ctxt, #S_PSTATE]
> +.endm
> +
> +.macro restore_all_base_regs ctxt
Same remark about the pseudo-generic parameter.
> + ldr x0, [\ctxt, #S_PSTATE]
> + and x0, x0, #0xf0000000
Same remark about using the PSR_* macros.
> + msr nzcv, x0
> + ldp x0, x1, [\ctxt, #S_X0]
> + ldp x2, x3, [\ctxt, #S_X2]
> + ldp x4, x5, [\ctxt, #S_X4]
> + ldp x6, x7, [\ctxt, #S_X6]
> + ldp x8, x9, [\ctxt, #S_X8]
> + ldp x10, x11, [\ctxt, #S_X10]
> + ldp x12, x13, [\ctxt, #S_X12]
> + ldp x14, x15, [\ctxt, #S_X14]
> + ldp x16, x17, [\ctxt, #S_X16]
> + ldp x18, x19, [\ctxt, #S_X18]
> + ldp x20, x21, [\ctxt, #S_X20]
> + ldp x22, x23, [\ctxt, #S_X22]
> + ldp x24, x25, [\ctxt, #S_X24]
> + ldp x26, x27, [\ctxt, #S_X26]
> + ldp x28, x29, [\ctxt, #S_X28]
> +.endm
> +
> +ENTRY(kretprobe_trampoline)
> +
> + sub sp, sp, #S_FRAME_SIZE
> +
> + save_all_base_regs sp
> +
> + mov x0, sp
> + bl trampoline_probe_handler
> + /* Replace trampoline address in lr with actual
> + orig_ret_addr return address. */
> + mov lr, x0
> +
> + restore_all_base_regs sp
> +
> + add sp, sp, #S_FRAME_SIZE
> +
> + ret
> +
> +ENDPROC(kretprobe_trampoline)
Thanks,
M.
--
Jazz is not dead. It just smells funny.
On 13/03/2016:12:09:03 PM, Marc Zyngier wrote:
> On Wed, 9 Mar 2016 00:32:18 -0500
> David Long <[email protected]> wrote:
>
> > +pstate_check_t * const opcode_condition_checks[16] = {
> > + __check_eq, __check_ne, __check_cs, __check_cc,
> > + __check_mi, __check_pl, __check_vs, __check_vc,
> > + __check_hi, __check_ls, __check_ge, __check_lt,
> > + __check_gt, __check_le, __check_al, __check_al
>
> The very last entry seems wrong, or is at least the opposite of what
> the current code has. It should be something called __check_nv(), and
> always return false (condition code NEVER).
Maybe the name __check_nv() is more appropriate by definition, but shouldn't it
still return true? The TRM says:
"The condition code NV exists only to provide a valid disassembly of the 0b1111
encoding, otherwise its behavior is identical to AL"
~Pratyush
On Mon, 14 Mar 2016 09:34:55 +0530
Pratyush Anand <[email protected]> wrote:
Hi Pratyush,
> On 13/03/2016:12:09:03 PM, Marc Zyngier wrote:
> > On Wed, 9 Mar 2016 00:32:18 -0500
> > David Long <[email protected]> wrote:
> >
> > > +pstate_check_t * const opcode_condition_checks[16] = {
> > > + __check_eq, __check_ne, __check_cs, __check_cc,
> > > + __check_mi, __check_pl, __check_vs, __check_vc,
> > > + __check_hi, __check_ls, __check_ge, __check_lt,
> > > + __check_gt, __check_le, __check_al, __check_al
> >
> > The very last entry seems wrong, or is at least the opposite of what
> > the current code has. It should be something called __check_nv(), and
> > always return false (condition code NEVER).
>
> May be __check_nv() name is more appropriate as per definition, but shouldn't it
> still return true, because TRM says:
> "The condition code NV exists only to provide a valid disassembly of the 0b1111
> encoding, otherwise its behavior is identical to AL"
Indeed, I missed that. But this interpretation is for the A64
instruction set, and this array is also used by the new
arm32_check_condition. The condition code table for A32 seems to
completely ignore the 0b1111 code (there is simply no entry for it), and
it is only in the ConditionHolds pseudocode that you can see how this
is actually special-cased.
So I'm fine leaving the code as it is, but a comment and a pointer to
the ARMv8 ARM wouldn't go amiss.
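For reference, this is roughly how I read the ConditionHolds pseudocode, written as a user-space sketch rather than as a replacement for the table in the patch (condition_holds(), a32_check_condition_sketch() and the CONDTEST_* names are invented here):

```c
#include <stdbool.h>
#include <stdint.h>

#define PSR_V_BIT	(1UL << 28)
#define PSR_C_BIT	(1UL << 29)
#define PSR_Z_BIT	(1UL << 30)
#define PSR_N_BIT	(1UL << 31)

enum condtest { CONDTEST_FAIL, CONDTEST_PASS, CONDTEST_UNCOND };

/* Evaluate a 4-bit condition code against the NZCV flags in pstate. */
static bool condition_holds(unsigned int cond, unsigned long pstate)
{
	bool n = (pstate & PSR_N_BIT) != 0;
	bool z = (pstate & PSR_Z_BIT) != 0;
	bool c = (pstate & PSR_C_BIT) != 0;
	bool v = (pstate & PSR_V_BIT) != 0;
	bool result;

	switch ((cond >> 1) & 0x7) {
	case 0: result = z; break;			/* EQ / NE */
	case 1: result = c; break;			/* CS / CC */
	case 2: result = n; break;			/* MI / PL */
	case 3: result = v; break;			/* VS / VC */
	case 4: result = c && !z; break;		/* HI / LS */
	case 5: result = (n == v); break;		/* GE / LT */
	case 6: result = (n == v) && !z; break;		/* GT / LE */
	default: result = true; break;			/* AL */
	}

	/* Odd codes invert the result, except 0b1111, which stays true. */
	if ((cond & 1) && cond != 0xf)
		result = !result;

	return result;
}

/* A32 wrapper: 0b1111 means "unconditional encoding", not a flag test. */
static enum condtest a32_check_condition_sketch(uint32_t opcode,
						unsigned long pstate)
{
	unsigned int cond = opcode >> 28;

	if (cond == 0xf)
		return CONDTEST_UNCOND;

	return condition_holds(cond, pstate) ? CONDTEST_PASS : CONDTEST_FAIL;
}
```

The only place where 0b1111 is treated differently from the other odd codes is the final inversion, which is the detail the comment in insn.c should point at.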
Thanks,
M.
--
Jazz is not dead. It just smells funny.
David,
On 09/03/16 05:32, David Long wrote:
> From: "David A. Long" <[email protected]>
>
> Add HAVE_REGS_AND_STACK_ACCESS_API feature for arm64.
>
> Signed-off-by: David A. Long <[email protected]>
> ---
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/ptrace.h | 31 +++++++++++
> arch/arm64/kernel/ptrace.c | 117 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 149 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 8cc6228..4211b0d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -78,6 +78,7 @@ config ARM64
> select HAVE_PERF_EVENTS
> select HAVE_PERF_REGS
> select HAVE_PERF_USER_STACK_DUMP
> + select HAVE_REGS_AND_STACK_ACCESS_API
> select HAVE_RCU_TABLE_FREE
> select HAVE_SYSCALL_TRACEPOINTS
> select IOMMU_DMA if IOMMU_SUPPORT
> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> index e9e5467..7bd6445 100644
> --- a/arch/arm64/include/asm/ptrace.h
> +++ b/arch/arm64/include/asm/ptrace.h
> @@ -118,6 +118,8 @@ struct pt_regs {
> u64 syscallno;
> };
>
> +#define MAX_REG_OFFSET offsetof(struct user_pt_regs, pstate)
So here you're using user_pt_regs...
> +
> #define arch_has_single_step() (1)
>
> #ifdef CONFIG_COMPAT
> @@ -146,6 +148,35 @@ struct pt_regs {
> #define user_stack_pointer(regs) \
> (!compat_user_mode(regs) ? (regs)->sp : (regs)->compat_sp)
>
> +extern int regs_query_register_offset(const char *name);
> +extern const char *regs_query_register_name(unsigned int offset);
> +extern bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr);
> +extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
> + unsigned int n);
> +
> +/**
> + * regs_get_register() - get register value from its offset
> + * @regs: pt_regs from which register value is gotten
> + * @offset: offset number of the register.
> + *
> + * regs_get_register returns the value of a register whose offset from @regs.
> + * The @offset is the offset of the register in struct pt_regs.
Is that the offset in pt_regs? Or should it be in the actual regs array
instead? So far, this is the same thing, but that feels pretty fragile.
> + * If @offset is bigger than MAX_REG_OFFSET, this returns 0.
> + */
> +static inline u64 regs_get_register(struct pt_regs *regs,
> + unsigned int offset)
... and here this is pt_regs. I know that the structures are quite
similar, but some uniformity wouldn't hurt. Given that this series is
mostly concerned with kernel space, it should probably be the latter
(pt_regs) rather than the former.
> +{
> + if (unlikely(offset > MAX_REG_OFFSET))
> + return 0;
> + return *(u64 *)((u64)regs + offset);
Now that's a bit disgusting... You are assuming way too much about the
layout of pt_regs (what if someone inserts a new field right before the
union?). How about:
	u64 *reg_array = regs->regs;
	return reg_array[offset >> 3];
instead? I know the semantics are not the same, but I'd really like to see
something a bit more robust.
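Something along these lines is what I have in mind, as a user-space sketch only. It assumes a simplified layout (struct pt_regs_model stands in for the real pt_regs, and regs_get_register_sketch() is a made-up name), keeps the offset relative to the start of the structure, and never indexes past the end of the regs array:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the register part of struct pt_regs. */
struct pt_regs_model {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

/*
 * Return the register stored at byte offset @offset, or 0 if the offset
 * does not name a register. Named fields are matched explicitly, so the
 * function keeps working if a field is later added around the array.
 */
static uint64_t regs_get_register_sketch(const struct pt_regs_model *regs,
					 size_t offset)
{
	const size_t base = offsetof(struct pt_regs_model, regs);
	size_t idx;

	if (offset == offsetof(struct pt_regs_model, sp))
		return regs->sp;
	if (offset == offsetof(struct pt_regs_model, pc))
		return regs->pc;
	if (offset == offsetof(struct pt_regs_model, pstate))
		return regs->pstate;

	/* Anything else must be an aligned slot inside regs[]. */
	if (offset < base || (offset - base) % sizeof(uint64_t))
		return 0;

	idx = (offset - base) / sizeof(uint64_t);
	if (idx >= sizeof(regs->regs) / sizeof(regs->regs[0]))
		return 0;

	return regs->regs[idx];
}
```

It is a little longer than the pointer arithmetic, but a misaligned or out-of-range offset now returns 0 instead of reading whatever happens to live there.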
> +}
> +
> +/* Valid only for Kernel mode traps. */
> +static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
> +{
> + return regs->sp;
> +}
> +
> static inline unsigned long regs_return_value(struct pt_regs *regs)
> {
> return regs->regs[0];
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index ff7f132..efebf0f 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -48,6 +48,123 @@
> #define CREATE_TRACE_POINTS
> #include <trace/events/syscalls.h>
>
> +struct pt_regs_offset {
> + const char *name;
> + int offset;
> +};
> +
> +#define REG_OFFSET_NAME(r) {.name = #r, .offset = offsetof(struct pt_regs, r)}
> +#define REG_OFFSET_END {.name = NULL, .offset = 0}
> +#define GPR_OFFSET_NAME(r) \
> + {.name = "x" #r, .offset = offsetof(struct pt_regs, regs[r])}
> +
> +static const struct pt_regs_offset regoffset_table[] = {
> + GPR_OFFSET_NAME(0),
> + GPR_OFFSET_NAME(1),
> + GPR_OFFSET_NAME(2),
> + GPR_OFFSET_NAME(3),
> + GPR_OFFSET_NAME(4),
> + GPR_OFFSET_NAME(5),
> + GPR_OFFSET_NAME(6),
> + GPR_OFFSET_NAME(7),
> + GPR_OFFSET_NAME(8),
> + GPR_OFFSET_NAME(9),
> + GPR_OFFSET_NAME(10),
> + GPR_OFFSET_NAME(11),
> + GPR_OFFSET_NAME(12),
> + GPR_OFFSET_NAME(13),
> + GPR_OFFSET_NAME(14),
> + GPR_OFFSET_NAME(15),
> + GPR_OFFSET_NAME(16),
> + GPR_OFFSET_NAME(17),
> + GPR_OFFSET_NAME(18),
> + GPR_OFFSET_NAME(19),
> + GPR_OFFSET_NAME(20),
> + GPR_OFFSET_NAME(21),
> + GPR_OFFSET_NAME(22),
> + GPR_OFFSET_NAME(23),
> + GPR_OFFSET_NAME(24),
> + GPR_OFFSET_NAME(25),
> + GPR_OFFSET_NAME(26),
> + GPR_OFFSET_NAME(27),
> + GPR_OFFSET_NAME(28),
> + GPR_OFFSET_NAME(29),
> + GPR_OFFSET_NAME(30),
> + {.name = "lr", .offset = offsetof(struct pt_regs, regs[30])},
> + REG_OFFSET_NAME(sp),
> + REG_OFFSET_NAME(pc),
> + REG_OFFSET_NAME(pstate),
> + REG_OFFSET_END,
> +};
> +
> +/**
> + * regs_query_register_offset() - query register offset from its name
> + * @name: the name of a register
> + *
> + * regs_query_register_offset() returns the offset of a register in struct
> + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
> + */
> +int regs_query_register_offset(const char *name)
> +{
> + const struct pt_regs_offset *roff;
> +
> + for (roff = regoffset_table; roff->name != NULL; roff++)
> + if (!strcmp(roff->name, name))
> + return roff->offset;
> + return -EINVAL;
> +}
> +
> +/**
> + * regs_query_register_name() - query register name from its offset
> + * @offset: the offset of a register in struct pt_regs.
> + *
> + * regs_query_register_name() returns the name of a register from its
> + * offset in struct pt_regs. If the @offset is invalid, this returns NULL;
> + */
> +const char *regs_query_register_name(unsigned int offset)
> +{
> + const struct pt_regs_offset *roff;
> +
> + for (roff = regoffset_table; roff->name != NULL; roff++)
> + if (roff->offset == offset)
> + return roff->name;
> + return NULL;
> +}
> +
> +/**
> + * regs_within_kernel_stack() - check the address in the stack
> + * @regs: pt_regs which contains kernel stack pointer.
> + * @addr: address which is checked.
> + *
> + * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
> + * If @addr is within the kernel stack, it returns true. If not, returns false.
> + */
> +bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
> +{
> + return ((addr & ~(THREAD_SIZE - 1)) ==
> + (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
> +}
> +
> +/**
> + * regs_get_kernel_stack_nth() - get Nth entry of the stack
> + * @regs: pt_regs which contains kernel stack pointer.
> + * @n: stack entry number.
> + *
> + * regs_get_kernel_stack_nth() returns @n th entry of the kernel stack which
> + * is specified by @regs. If the @n th entry is NOT in the kernel stack,
> + * this returns 0.
> + */
> +unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n)
> +{
> + unsigned long *addr = (unsigned long *)kernel_stack_pointer(regs);
> +
> + addr += n;
> + if (regs_within_kernel_stack(regs, (unsigned long)addr))
> + return *addr;
> + else
> + return 0;
> +}
> +
> /*
> * TODO: does not yet catch signals sent when the child dies.
> * in exit.c or in signal.c.
>
Thanks,
M.
--
Jazz is not dead. It just smells funny...
Hi David,
On 09/03/16 05:32, David Long wrote:
> From: "David A. Long" <[email protected]>
> diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
> index 4699cd7..0ac2131 100644
> --- a/arch/arm64/lib/copy_from_user.S
> +++ b/arch/arm64/lib/copy_from_user.S
> @@ -66,6 +66,7 @@
> .endm
>
> end .req x5
> + .section .kprobes.text,"ax",%progbits
> ENTRY(__copy_from_user)
> ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
> CONFIG_ARM64_PAN)
> diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
> index 7512bbb..e4eb84c 100644
> --- a/arch/arm64/lib/copy_to_user.S
> +++ b/arch/arm64/lib/copy_to_user.S
> @@ -65,6 +65,7 @@
> .endm
>
> end .req x5
> + .section .kprobes.text,"ax",%progbits
> ENTRY(__copy_to_user)
> ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
> CONFIG_ARM64_PAN)
>
If I understand this correctly - you can't kprobe these ldr/str instructions as
the fault handler wouldn't find kprobe's out-of-line version of the instruction
in the exception table... but why only these two functions? (for library
functions, we also have clear_user() and copy_in_user()...)
The get_user()/put_user() stuff in uaccess.h gets inlined all over the kernel; I
don't think it's feasible to put all of these in a separate section.
Is it feasible to search the exception table at runtime instead? If an
address-to-be-kprobed appears in the list, we know it could generate exceptions,
so we should report that we can't probe this address. That would catch all of
the library functions, all the places uaccess.h was inlined, and anything new
that gets invented in the future.
> Currrently taking exceptions when accessing user data from a kprobe'd
(Nit: Currently)
Thanks,
James
On 15/03/2016:06:47:52 PM, James Morse wrote:
> Hi David,
>
> On 09/03/16 05:32, David Long wrote:
> > From: "David A. Long" <[email protected]>
> > diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
> > index 4699cd7..0ac2131 100644
> > --- a/arch/arm64/lib/copy_from_user.S
> > +++ b/arch/arm64/lib/copy_from_user.S
> > @@ -66,6 +66,7 @@
> > .endm
> >
> > end .req x5
> > + .section .kprobes.text,"ax",%progbits
> > ENTRY(__copy_from_user)
> > ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
> > CONFIG_ARM64_PAN)
> > diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
> > index 7512bbb..e4eb84c 100644
> > --- a/arch/arm64/lib/copy_to_user.S
> > +++ b/arch/arm64/lib/copy_to_user.S
> > @@ -65,6 +65,7 @@
> > .endm
> >
> > end .req x5
> > + .section .kprobes.text,"ax",%progbits
> > ENTRY(__copy_to_user)
> > ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
> > CONFIG_ARM64_PAN)
> >
>
> If I understand this correctly - you can't kprobe these ldr/str instructions as
> the fault handler wouldn't find kprobe's out-of line version of the instruction
> in the exception table... but why only these two functions? (for library
> functions, we also have clear_user() and copy_in_user()...)
Maybe not clear_user(), because that is inlined, but maybe __clear_user().
There can be many other functions (see [1] and [2], and there may be many more)
which need to be blacklisted, but I think they can always be added later on, and
at least this aspect should not hinder inclusion of these patches.
>
> The get_user()/put_user() stuff in uaccess.h gets inlined all over the kernel, I
> don't think its feasible to put all of these in a separate section.
Yes, it does not seem possible to blacklist inlined functions. There can be
some other places, like valid kprobable instructions in atomic context, a .word
directive holding data that happens to be a valid instruction, etc. So it's
probably not possible to make this 100% safe, but yes, wherever possible, we
should take care. In fact, other arches are also not completely safe. One can try
to put a kprobe on every symbol in kallsyms on an x86_64 machine, and the kernel
crashes.
>
> Is it feasible to search the exception table at runtime instead? If an
> address-to-be-kprobed appears in the list, we know it could generate exceptions,
> so we should report that we can't probe this address. That would catch all of
> the library functions, all the places uaccess.h was inlined, and anything new
> that gets invented in the future.
Sorry, I probably did not get it. How can an inlined address range be placed
in the exception table or any other code area?
~Pratyush
[1] https://github.com/pratyushanand/linux/commit/855bc4dbb98ceafac4c933e00d203b1cd7ee9ca4
[2] https://github.com/pratyushanand/linux/commit/8bc586d6f767240e9ffa582f45a9ad11de47ecfb
Hi Pratyush,
On 16/03/16 05:43, Pratyush Anand wrote:
> On 15/03/2016:06:47:52 PM, James Morse wrote:
>> If I understand this correctly - you can't kprobe these ldr/str instructions
>> as the fault handler wouldn't find kprobe's out-of line version of the
>> instruction in the exception table... but why only these two functions? (for
>> library functions, we also have clear_user() and copy_in_user()...)
>
> May be not clear_user() because those are inlined, but may be __clear_user().
You're right - the other library functions in that same directory are what I meant.
>> Is it feasible to search the exception table at runtime instead? If an
>> address-to-be-kprobed appears in the list, we know it could generate exceptions,
>> so we should report that we can't probe this address. That would catch all of
>> the library functions, all the places uaccess.h was inlined, and anything new
>> that gets invented in the future.
>
> Sorry, probably I could not get it. How can an inlined addresses range be placed
> in exception table or any other code area.
Ah, not a section or code area, sorry I wasn't clear:
When a fault happens in the kernel, the fault handler
(/arch/arm64/mm/fault.c:do_page_fault()) calls search_exception_tables(regs->pc)
to see if the faulting address has a 'fixup' registered. If it does, the fixup
causes -EFAULT to be returned; if not, it ends up in die().
The horrible block of assembler in
arch/arm64/include/asm/uaccess.h:__get_user_asm() adds the address of the
instruction that is allowed to fault to the __ex_table section:
> .section __ex_table,"a"
> .align 3
> .quad 1b, 3b
> .previous
Here 1b is the address of the instruction that can fault, and 3b is the fixup
that moves -EFAULT into the return value.
This works for get_user() and friends which are inlined all over the kernel. It
even works for modules, as there is an exception table for each module which is
searched by kernel/module.c:search_module_extables().
This list of addresses that can fault already exists; there is even an API
function to check for a given address. Grabbing the nearest vmlinux, there are
~1300 entries in the __ex_table section. This patch blacklists two of them;
using search_exception_tables() would blacklist them all.
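A rough user-space model of that check, assuming a table sorted by instruction address as the kernel's is (struct extable_entry_model, search_extable_model() and probe_addr_may_fault() are names invented for this sketch, not existing kernel functions):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model of one __ex_table entry: faulting instruction and its fixup. */
struct extable_entry_model {
	uint64_t insn;
	uint64_t fixup;
};

/* Binary search of a table sorted by insn; returns NULL if addr is absent. */
static const struct extable_entry_model *
search_extable_model(const struct extable_entry_model *table, size_t n,
		     uint64_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].insn == addr)
			return &table[mid];
		if (table[mid].insn < addr)
			lo = mid + 1;
		else
			hi = mid;
	}

	return NULL;
}

/* An address with a registered fixup may fault, so it should not be probed. */
static bool probe_addr_may_fault(const struct extable_entry_model *table,
				 size_t n, uint64_t addr)
{
	return search_extable_model(table, n, addr) != NULL;
}
```

In the kernel, the equivalent test at probe registration time would presumably pass the probe address to search_exception_tables() and refuse the probe if an entry comes back.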
I've had a quick look at x86 and sparc; it looks like they allow probed
instructions to fault, do_page_fault()->kprobes_fault()->kprobe_fault_handler()
- which uses the original probed address with search_exception_tables() to find
and run the fixup. I doubt this is needed in an initial version of kprobes
(maybe it's later in this series - I haven't read all the way through it yet).
Thanks,
James
Hi James,
On 16/03/2016:10:27:22 AM, James Morse wrote:
> Hi Pratyush,
>
> On 16/03/16 05:43, Pratyush Anand wrote:
> > On 15/03/2016:06:47:52 PM, James Morse wrote:
> >> If I understand this correctly - you can't kprobe these ldr/str instructions
> >> as the fault handler wouldn't find kprobe's out-of line version of the
> >> instruction in the exception table... but why only these two functions? (for
> >> library functions, we also have clear_user() and copy_in_user()...)
> >
> > May be not clear_user() because those are inlined, but may be __clear_user().
>
> You're right - the other library functions in that same directory is what I meant..
>
> >> Is it feasible to search the exception table at runtime instead? If an
> >> address-to-be-kprobed appears in the list, we know it could generate exceptions,
> >> so we should report that we can't probe this address. That would catch all of
> >> the library functions, all the places uaccess.h was inlined, and anything new
> >> that gets invented in the future.
> >
> > Sorry, probably I could not get it. How can an inlined addresses range be placed
> > in exception table or any other code area.
>
> Ah, not a section or code area, sorry I wasn't clear:
>
> When a fault happens in the kernel, the fault handler
> (/arch/arm64/mm/fault.c:do_page_fault()) calls search_exception_tables(regs->pc)
> to see if the faulting address has a 'fixup' registered. If it does, the fixup
> causes -EFAULT to be returned, if not it ends up in die().
>
> The horrible block of assembler in
> arch/arm64/include/asm/uaccess.h:__get_user_asm() adds the address of the
> instruction that is allowed to fault to the __ex_table section:
> > .section __ex_table,"a"
> > .align 3
> > .quad 1b, 3b
> > .previous
>
> Here 1b is the address of the instruction that can fault, and 3b is the fixup
> that moves -EFAULT into the return value.
>
> This works for get_user() and friends which are inlined all over the kernel. It
> even works for modules, as there is an exception table for each module which is
> searched by kernel/module.c:search_module_extables().
>
> This list of addresses that can fault already exists, there is even an API
> function to check for a given address. Grabbing the nearest vmlinux, there are
> ~1300 entries in the __ex_table section, this patch blacklists two of them,
> using search_exception_tables() obviously blacklists them all.
Thanks a lot for explaining it. I get it now, and I agree with your idea. But...
>
>
> I've had a quick look at x86 and sparc, it looks like they allowed probed
> instructions to fault, do_page_fault()->kprobes_fault()->kprobe_fault_handler()
> - which uses the original probed address with search_exception_tables() to find
> and run the fixup. I doubt this is needed in an initial version of kprobes,
> (maybe its later in this series - I haven't read all the way through it yet).
Hmm. We do have fixup_exception() in the arm64 kprobe_fault_handler(), so it
should have worked without this patch.
@David: This patch was added in v9, and fixup_exception() was dropped in v9.
Because dropping fixup_exception() also caused some systemtap test cases to
fail, it was added back in v10. I wonder if we really need this patch.
Maybe you can try running the related test case with this patch dropped.
Thanks, James, for bringing this up.
~Pratyush
>From: "David A. Long" <[email protected]>
>
>Currently taking exceptions when accessing user data from a kprobe'd
>instruction doesn't work. Avoid this situation by blacklisting the relevant
>functions.
>
>Signed-off-by: David A. Long <[email protected]>
Looks good to me.
Reviewed-by: Masami Hiramatsu <[email protected]>
Thanks,
>---
> arch/arm64/lib/copy_from_user.S | 1 +
> arch/arm64/lib/copy_to_user.S | 1 +
> 2 files changed, 2 insertions(+)
>
>diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
>index 4699cd7..0ac2131 100644
>--- a/arch/arm64/lib/copy_from_user.S
>+++ b/arch/arm64/lib/copy_from_user.S
>@@ -66,6 +66,7 @@
> .endm
>
> end .req x5
>+ .section .kprobes.text,"ax",%progbits
> ENTRY(__copy_from_user)
> ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
> CONFIG_ARM64_PAN)
>diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
>index 7512bbb..e4eb84c 100644
>--- a/arch/arm64/lib/copy_to_user.S
>+++ b/arch/arm64/lib/copy_to_user.S
>@@ -65,6 +65,7 @@
> .endm
>
> end .req x5
>+ .section .kprobes.text,"ax",%progbits
> ENTRY(__copy_to_user)
> ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \
> CONFIG_ARM64_PAN)
>--
>2.5.0
>
>
>_______________________________________________
>linux-arm-kernel mailing list
>[email protected]
>http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi,
>From: Sandeepa Prabhu <[email protected]>
>
>The pre-handler of this special 'trampoline' kprobe executes the return
>probe handler functions and restores original return address in ELR_EL1.
>This way the saved pt_regs still hold the original register context to be
>carried back to the probed kernel function.
This patch does not seem to be well separated.
>diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
>index bd3f233..13d3333 100644
>--- a/arch/arm64/kernel/kprobes.c
>+++ b/arch/arm64/kernel/kprobes.c
[snip]
>+void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
>+ struct pt_regs *regs)
>+{
>+ ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
>+
>+ /* replace return addr (x30) with trampoline */
>+ regs->regs[30] = (long)&kretprobe_trampoline;
So, where is kretprobe_trampoline? It seems that function is
defined in another patch.
>+}
>+
>+int __kprobes arch_trampoline_kprobe(struct kprobe *p)
>+{
>+ return 0;
> }
And what is this function for?
Thank you,
>
> int __init arch_init_kprobes(void)
>--
>2.5.0
>
>
>From: HIRAMATU, MASAMI [mailto:[email protected]]
>
>Hi,
>
>>From: Sandeepa Prabhu <[email protected]>
>>
>>The pre-handler of this special 'trampoline' kprobe executes the return
>>probe handler functions and restores original return address in ELR_EL1.
>>This way the saved pt_regs still hold the original register context to be
>>carried back to the probed kernel function.
>
>This patch seems not well separated.
>
>>diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
>>index bd3f233..13d3333 100644
>>--- a/arch/arm64/kernel/kprobes.c
>>+++ b/arch/arm64/kernel/kprobes.c
>
>[snip]
>
>>+void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
>>+ struct pt_regs *regs)
>>+{
>>+ ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
>>+
>>+ /* replace return addr (x30) with trampoline */
>>+ regs->regs[30] = (long)&kretprobe_trampoline;
>
>So, where is the kretprobe_trampoline? It seems that function is
>defined in other patch.
>
>>+}
>>+
>>+int __kprobes arch_trampoline_kprobe(struct kprobe *p)
>>+{
>>+ return 0;
>> }
>
>And what this function is for??
Ah, sorry, this was my fault. Yes, this function is required.
But this implementation also means there is an asm-based trampoline
function, which should be included in this patch.
David, could you point me to the repository where I can get the latest
version of this series? I'd like to see the whole kprobes/arm64 code.
Thank you,
On 03/11/2016 01:07 PM, James Morse wrote:
> Hi David,
>
> On 09/03/16 05:32, David Long wrote:
>> From: "David A. Long" <[email protected]>
>>
>> Add HAVE_REGS_AND_STACK_ACCESS_API feature for arm64.
>>
>> Signed-off-by: David A. Long <[email protected]>
>
>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>> index ff7f132..efebf0f 100644
>> --- a/arch/arm64/kernel/ptrace.c
>> +++ b/arch/arm64/kernel/ptrace.c
>
> [ ... SNIP ... ]
>
>> +/**
>> + * regs_within_kernel_stack() - check the address in the stack
>> + * @regs: pt_regs which contains kernel stack pointer.
>> + * @addr: address which is checked.
>> + *
>> + * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
>> + * If @addr is within the kernel stack, it returns true. If not, returns false.
>> + */
>> +bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
>> +{
>> + return ((addr & ~(THREAD_SIZE - 1)) ==
>> + (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
>
> I'm not sure where this is called from, but if kernel_stack_pointer(regs) could
> ever point into an irq_stack you will get the wrong result.
>
> arch/arm64/include/asm/irq.h has 'on_irq_stack(sp, cpu)' which should help,
> although you will need to check the bounds of the irq_stack separately.
>
>
> The horrible details...
>
> From arch/arm64/kernel/irq.c:20
>> /* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
>> DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack)
>> __aligned(16);
>
> This was because per-cpu variables can be at-most page aligned.
> 6cdf9c7ca687 ("arm64: Store struct thread_info in sp_el0") changed
> current_thread_info() to work on these weirdly aligned irq_stacks.
>
>
> Thanks,
>
> James
>
>
It looks like this is ultimately used (currently) only by the
arch-independent kprobes tracing code. But it does seem like this will
be recording the wrong data when stack contents are being traced from
interrupt routine probes. I will put a fix in for the next spin.
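As a rough illustration of the problem James describes, the user-space sketch below compares the mask test from the patch with an explicit bounds test. The sizes, addresses, and function names are assumptions made for the example; the real fix would use the kernel's own irq_stack helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_THREAD_SIZE UINT64_C(0x4000)  /* assumed 16 KiB stack size */

/* Mask comparison, as in the patch: valid only for size-aligned stacks. */
static bool within_stack_by_mask(uint64_t sp, uint64_t addr)
{
	return (addr & ~(MODEL_THREAD_SIZE - 1)) ==
	       (sp & ~(MODEL_THREAD_SIZE - 1));
}

/* Explicit bounds comparison: works for any base, e.g. 16-byte aligned. */
static bool within_stack_by_bounds(uint64_t base, uint64_t size, uint64_t addr)
{
	assert(size != 0);
	return addr >= base && addr - base < size;
}
```

For a stack that is only 16-byte aligned, the mask test can both reject addresses that are on the stack and accept addresses that are not, which is why the irq_stack bounds need a separate check.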
Thanks,
-dl
On 17/03/2016:01:27:26 PM, Pratyush Anand wrote:
> @David: This patch was added in v9 and fixup_exception() had been dropped in v9.
> Since, dropping of fixup_exception() also caused to fail some systemtap test
> cases, so it was added back in v10. I wonder if we really need this patch.
> May be you can try to run related test case by dropping this patch.
I had a closer look at the code and noticed that fixup_exception() does not play
any role in handling a page fault in copy_to_user(). So why do we have the
problem?
I think I can see why it does not work. When we are single-stepping an
instruction and a page fault occurs, we arrive at el1_da in entry.S, where we
do enable_dbg. As soon as we do this, we start receiving a single-step
exception after each instruction (not sure, possibly after every other
instruction). Since there is no matching single-step handler for
these instructions, we see the warning "Unexpected kernel single-step
exception at EL1".
So I think we should either
(1) not enable debug for el1_da, or
(2) do enable_dbg only when single-stepping is not enabled, or
(3) disable single-stepping during el1_da execution.
(1) will solve the issue for sure, but I am not sure it is the best choice.
Will, what do you suggest?
~Pratyush
Hi Pratyush,
On 18/03/16 13:29, Pratyush Anand wrote:
> Probably, I can see why does not it work. So, when we are single stepping an
> instruction and page fault occurs, we will come to el1_da in entry.S. Here, we
> do enable_dbg. As soon as we will do this, we will start receiving single step
> exception after each instruction (not sure, probably for each alternate
> instruction). Since, there will not be any matching single step handler for
> these instructions, so we will see warning "Unexpected kernel single-step
> exception at EL1".
>
> So, I think, we should
>
> (1) may be do not enable debug for el1_da, or
> (2) enable_dbg only when single stepping is not enabled, or
> (3) or disable single stepping during el1_da execution.
>
> (1) will solve the issue for sure, but not sure if it could be the best choice.
A variation on (3):
In kernel/entry.S when entered from EL0 we test for TIF_SINGLESTEP in the
thread_info flags, and use disable_step_tsk/enable_step_tsk to save/restore the
single-step state.
Could we do this regardless of which EL we came from?
Thanks,
James
Hi James,
On 18/03/2016:02:02:49 PM, James Morse wrote:
> Hi Pratyush,
>
> On 18/03/16 13:29, Pratyush Anand wrote:
> > Probably, I can see why does not it work. So, when we are single stepping an
> > instruction and page fault occurs, we will come to el1_da in entry.S. Here, we
> > do enable_dbg. As soon as we will do this, we will start receiving single step
> > exception after each instruction (not sure, probably for each alternate
> > instruction). Since, there will not be any matching single step handler for
> > these instructions, so we will see warning "Unexpected kernel single-step
> > exception at EL1".
> >
> > So, I think, we should
> >
> > (1) may be do not enable debug for el1_da, or
> > (2) enable_dbg only when single stepping is not enabled, or
> > (3) or disable single stepping during el1_da execution.
> >
> > (1) will solve the issue for sure, but not sure if it could be the best choice.
>
> A variation on (3):
>
> In kernel/entry.S when entered from EL0 we test for TIF_SINGLESTEP in the
> thread_info flags, and use disable_step_tsk/enable_step_tsk to save/restore the
> single-step state.
>
> Could we do this regardless of which EL we came from?
Thanks for another idea. I think we cannot do this as it is, because
TIF_SINGLESTEP will not be set for kprobe events. But we can introduce
variants, disable_step_kernel and enable_step_kernel, which can be called in
el1_da.
I will write a test case to reproduce the issue without this patch, and then
test with a patch based on something like the above.
~Pratyush
Hi Pratyush,
On 18/03/16 14:43, Pratyush Anand wrote:
> On 18/03/2016:02:02:49 PM, James Morse wrote:
>> In kernel/entry.S when entered from EL0 we test for TIF_SINGLESTEP in the
>> thread_info flags, and use disable_step_tsk/enable_step_tsk to save/restore the
>> single-step state.
>>
>> Could we do this regardless of which EL we came from?
>
> Thanks for another idea. I think, we can not do this as it is, because
> TIF_SINGLESTEP will not be set for kprobe events.
Hmmm, I see kernel_enable_single_step() doesn't set it, but setup_singlestep()
in patch 5 could...
There is probably a good reason it's never set for a kernel thread. I will have a
look at where else it is used.
> But, we can introduce a
> variant disable_step_kernel and enable_step_kernel, which can be called in
> el1_da.
What about sp/pc misalignment, or undefined instructions?
Or worse... an irq occurs during your el1_da call (el1_da may re-enable irqs).
el1_irq doesn't know you were careful not to unmask debug exceptions, it blindly
turns them back on.
The problem is that the 'single step me' bit is still set. Saving and restoring
it will save us from having to consider every interaction (and then missing some!).
It would also mean you don't have to disable interrupts while single stepping in
patch 5 (comment above kprobes_save_local_irqflag()).
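A toy model of the save/restore idea, purely for illustration: the struct and function names are hypothetical, the flags only stand in for the real debug state, and the model assumes one step exception per instruction while stepping is enabled and debug exceptions are unmasked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-CPU debug state; not the real registers. */
struct model_cpu {
	bool step_enabled;    /* stands in for the 'single step me' bit */
	bool debug_unmasked;  /* stands in for debug exceptions being unmasked */
	int spurious_steps;   /* step exceptions taken with no handler */
};

/* Run n instructions of a nested handler, counting unwanted step traps. */
static void model_run_handler(struct model_cpu *cpu, int n)
{
	assert(n >= 0);
	cpu->debug_unmasked = true;	/* models enable_dbg in el1_da */
	for (int i = 0; i < n; i++)
		if (cpu->step_enabled && cpu->debug_unmasked)
			cpu->spurious_steps++;
}

/* Nested exception entry that saves and clears the step bit, then restores. */
static void model_exception_save_restore(struct model_cpu *cpu, int n)
{
	bool saved = cpu->step_enabled;

	cpu->step_enabled = false;
	model_run_handler(cpu, n);
	cpu->step_enabled = saved;
}
```

In this model the nested handler no longer cares who unmasks debug exceptions, because the step bit is clear for its whole duration and is put back on exit.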
Thanks,
James
Hi James,
On 18/03/2016:06:12:20 PM, James Morse wrote:
> Hi Pratyush,
>
> On 18/03/16 14:43, Pratyush Anand wrote:
> > On 18/03/2016:02:02:49 PM, James Morse wrote:
> >> In kernel/entry.S when entered from EL0 we test for TIF_SINGLESTEP in the
> >> thread_info flags, and use disable_step_tsk/enable_step_tsk to save/restore the
> >> single-step state.
> >>
> >> Could we do this regardless of which EL we came from?
> >
> > Thanks for another idea. I think, we can not do this as it is, because
> > TIF_SINGLESTEP will not be set for kprobe events.
>
> Hmmm, I see kernel_enable_single_step() doesn't set it, but setup_singlestep()
> in patch 5 could...
>
> There is probably a good reason its never set for a kernel thread, I will have a
> look at where else it is used.
>
>
> > But, we can introduce a
> > variant disable_step_kernel and enable_step_kernel, which can be called in
> > el1_da.
>
> What about sp/pc misalignment, or undefined instructions?
> Or worse... an irq occurs during your el1_da call (el1_da may re-enable irqs).
> el1_irq doesn't know you were careful not to unmask debug exceptions, it blindly
> turns them back on.
>
> The problem is the 'single step me' bit is still set, save/restoring it will
> save us having to consider every interaction, (and then missing some!).
>
> It would also mean you don't have to disable interrupts while single stepping in
> patch 5 (comment above kprobes_save_local_irqflag()).
I see.
kernel_enable_single_step() is called from the watchpoint and kgdb handlers. It
seems to me that a similar issue may arise there as well. So it would be a good
idea to set TIF_SINGLESTEP in kernel_enable_single_step() and clear it in
kernel_disable_single_step().
Meanwhile, I prepared a test case to reproduce the issue without this patch.
I placed a kprobe on an instruction in __copy_to_user() that stores to user
space memory. I see a sea of "Unexpected kernel single-step
exception at EL1" messages within a few seconds. With patch [1] applied, I do
not see any such messages.
Maybe I can send [1] as an RFC and seek feedback.
~Pratyush
[1] https://github.com/pratyushanand/linux/commit/7623c8099ac22eaa00e7e0f52430f7a4bd154652
On 03/15/2016 07:04 AM, Marc Zyngier wrote:
> David,
>
> On 09/03/16 05:32, David Long wrote:
>> From: "David A. Long" <[email protected]>
>>
>> Add HAVE_REGS_AND_STACK_ACCESS_API feature for arm64.
>>
>> Signed-off-by: David A. Long <[email protected]>
>> ---
>> arch/arm64/Kconfig | 1 +
>> arch/arm64/include/asm/ptrace.h | 31 +++++++++++
>> arch/arm64/kernel/ptrace.c | 117 ++++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 149 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 8cc6228..4211b0d 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -78,6 +78,7 @@ config ARM64
>> select HAVE_PERF_EVENTS
>> select HAVE_PERF_REGS
>> select HAVE_PERF_USER_STACK_DUMP
>> + select HAVE_REGS_AND_STACK_ACCESS_API
>> select HAVE_RCU_TABLE_FREE
>> select HAVE_SYSCALL_TRACEPOINTS
>> select IOMMU_DMA if IOMMU_SUPPORT
>> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
>> index e9e5467..7bd6445 100644
>> --- a/arch/arm64/include/asm/ptrace.h
>> +++ b/arch/arm64/include/asm/ptrace.h
>> @@ -118,6 +118,8 @@ struct pt_regs {
>> u64 syscallno;
>> };
>>
>> +#define MAX_REG_OFFSET offsetof(struct user_pt_regs, pstate)
>
> So here you're using user_pt_regs...
>
Changed it to pt_regs, and removed the typecast in the
instruction_pointer() define to get things to compile again.
>> +
>> #define arch_has_single_step() (1)
>>
>> #ifdef CONFIG_COMPAT
>> @@ -146,6 +148,35 @@ struct pt_regs {
>> #define user_stack_pointer(regs) \
>> (!compat_user_mode(regs) ? (regs)->sp : (regs)->compat_sp)
>>
>> +extern int regs_query_register_offset(const char *name);
>> +extern const char *regs_query_register_name(unsigned int offset);
>> +extern bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr);
>> +extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
>> + unsigned int n);
>> +
>> +/**
>> + * regs_get_register() - get register value from its offset
>> + * @regs: pt_regs from which register value is gotten
>> + * @offset: offset number of the register.
>> + *
>> + * regs_get_register returns the value of a register whose offset from @regs.
>> + * The @offset is the offset of the register in struct pt_regs.
>
> Is that the offset in pt_regs? Or should it be in the actual regs array
> instead? So far, this is the same thing, but that feels pretty fragile.
>
It's the offset within the entire structure, including sp, pc, and pstate.
>> + * If @offset is bigger than MAX_REG_OFFSET, this returns 0.
>> + */
>> +static inline u64 regs_get_register(struct pt_regs *regs,
>> + unsigned int offset)
>
> ... and here this is pt_regs. I know that the structures are quite
> similar, but some uniformity wouldn't hurt. Given that this series is
> mostly concerned with kernel space, it should probably the latter rather
> than the former.
>
Yes, pt_regs it is.
>> +{
>> + if (unlikely(offset > MAX_REG_OFFSET))
>> + return 0;
>> + return *(u64 *)((u64)regs + offset);
>
> Now that's a bit disgusting... You are assuming way too much about the
> layout of pt_regs (imagine someone insert a new field right before the
> union?). How about:
>
> u64 *reg_array = regs->regs;
> return reg_array[offset >> 3];
>
> instead? I know the semantic is not the same, but I'd really like to see
> something a bit more robust.
This index is supposed to provide access to useful info saved in the pt_regs
structure, including regs, pc, sp, and pstate. This is how it works on
the other architectures that provide this feature. The index is going
to be something that was previously looked up with the
regs_query_register_offset() call, which ultimately looks up the offset
in the structure using the field name and a table created at compile
time (see regoffset_table in arch/arm64/kernel/ptrace.c). Additions to the
structure will not create a problem since the numeric value is not
hardcoded in existing code (although thought should be given to
whether new names should be added to regs_query_register_name()'s lookup
table to make them accessible).
Changing this mechanism would require changes to the generic code and to
each other architecture, while preserving the architecture-independent
nature of this API. Changing the API would likely affect performance
tools and debugger(s) too.
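The sketch below models this name-to-offset scheme in user space. The structure is a stand-in rather than the real struct pt_regs, all model_* names are hypothetical, the table is abbreviated, and the alignment check is an extra assumption that is not in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the arm64 struct pt_regs layout. */
struct model_pt_regs {
	uint64_t regs[31];
	uint64_t sp;
	uint64_t pc;
	uint64_t pstate;
};

#define MODEL_MAX_REG_OFFSET offsetof(struct model_pt_regs, pstate)

struct model_reg_offset {
	const char *name;
	int offset;
};

/* Built at compile time, so offsets follow the structure if it changes. */
static const struct model_reg_offset model_table[] = {
	{ "x0", offsetof(struct model_pt_regs, regs[0]) },
	{ "lr", offsetof(struct model_pt_regs, regs[30]) },
	{ "sp", offsetof(struct model_pt_regs, sp) },
	{ "pc", offsetof(struct model_pt_regs, pc) },
	{ "pstate", offsetof(struct model_pt_regs, pstate) },
	{ NULL, 0 },
};

/* Look a register up by name; returns -1 if the name is unknown. */
static int model_query_offset(const char *name)
{
	assert(name != NULL);
	for (const struct model_reg_offset *r = model_table; r->name; r++)
		if (strcmp(r->name, name) == 0)
			return r->offset;
	return -1;
}

/* Read the value at a previously looked-up offset; 0 if out of range. */
static uint64_t model_get_register(const struct model_pt_regs *regs,
				   unsigned int offset)
{
	uint64_t val;

	if (offset > MODEL_MAX_REG_OFFSET || offset % sizeof(val) != 0)
		return 0;
	memcpy(&val, (const char *)regs + offset, sizeof(val));
	return val;
}
```

Callers only ever pass offsets obtained from the lookup, so inserting a field into the structure changes the table values but not the calling code.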
>> +}
>> +
>> +/* Valid only for Kernel mode traps. */
>> +static inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
>> +{
>> + return regs->sp;
>> +}
>> +
>> static inline unsigned long regs_return_value(struct pt_regs *regs)
>> {
>> return regs->regs[0];
>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>> index ff7f132..efebf0f 100644
>> --- a/arch/arm64/kernel/ptrace.c
>> +++ b/arch/arm64/kernel/ptrace.c
>> @@ -48,6 +48,123 @@
>> #define CREATE_TRACE_POINTS
>> #include <trace/events/syscalls.h>
>>
>> +struct pt_regs_offset {
>> + const char *name;
>> + int offset;
>> +};
>> +
>> +#define REG_OFFSET_NAME(r) {.name = #r, .offset = offsetof(struct pt_regs, r)}
>> +#define REG_OFFSET_END {.name = NULL, .offset = 0}
>> +#define GPR_OFFSET_NAME(r) \
>> + {.name = "x" #r, .offset = offsetof(struct pt_regs, regs[r])}
>> +
>> +static const struct pt_regs_offset regoffset_table[] = {
>> + GPR_OFFSET_NAME(0),
>> + GPR_OFFSET_NAME(1),
>> + GPR_OFFSET_NAME(2),
>> + GPR_OFFSET_NAME(3),
>> + GPR_OFFSET_NAME(4),
>> + GPR_OFFSET_NAME(5),
>> + GPR_OFFSET_NAME(6),
>> + GPR_OFFSET_NAME(7),
>> + GPR_OFFSET_NAME(8),
>> + GPR_OFFSET_NAME(9),
>> + GPR_OFFSET_NAME(10),
>> + GPR_OFFSET_NAME(11),
>> + GPR_OFFSET_NAME(12),
>> + GPR_OFFSET_NAME(13),
>> + GPR_OFFSET_NAME(14),
>> + GPR_OFFSET_NAME(15),
>> + GPR_OFFSET_NAME(16),
>> + GPR_OFFSET_NAME(17),
>> + GPR_OFFSET_NAME(18),
>> + GPR_OFFSET_NAME(19),
>> + GPR_OFFSET_NAME(20),
>> + GPR_OFFSET_NAME(21),
>> + GPR_OFFSET_NAME(22),
>> + GPR_OFFSET_NAME(23),
>> + GPR_OFFSET_NAME(24),
>> + GPR_OFFSET_NAME(25),
>> + GPR_OFFSET_NAME(26),
>> + GPR_OFFSET_NAME(27),
>> + GPR_OFFSET_NAME(28),
>> + GPR_OFFSET_NAME(29),
>> + GPR_OFFSET_NAME(30),
>> + {.name = "lr", .offset = offsetof(struct pt_regs, regs[30])},
>> + REG_OFFSET_NAME(sp),
>> + REG_OFFSET_NAME(pc),
>> + REG_OFFSET_NAME(pstate),
>> + REG_OFFSET_END,
>> +};
>> +
>> +/**
>> + * regs_query_register_offset() - query register offset from its name
>> + * @name: the name of a register
>> + *
>> + * regs_query_register_offset() returns the offset of a register in struct
>> + * pt_regs from its name. If the name is invalid, this returns -EINVAL;
>> + */
>> +int regs_query_register_offset(const char *name)
>> +{
>> + const struct pt_regs_offset *roff;
>> +
>> + for (roff = regoffset_table; roff->name != NULL; roff++)
>> + if (!strcmp(roff->name, name))
>> + return roff->offset;
>> + return -EINVAL;
>> +}
>> +
>> +/**
>> + * regs_query_register_name() - query register name from its offset
>> + * @offset: the offset of a register in struct pt_regs.
>> + *
>> + * regs_query_register_name() returns the name of a register from its
>> + * offset in struct pt_regs. If the @offset is invalid, this returns NULL;
>> + */
>> +const char *regs_query_register_name(unsigned int offset)
>> +{
>> + const struct pt_regs_offset *roff;
>> +
>> + for (roff = regoffset_table; roff->name != NULL; roff++)
>> + if (roff->offset == offset)
>> + return roff->name;
>> + return NULL;
>> +}
>> +
>> +/**
>> + * regs_within_kernel_stack() - check the address in the stack
>> + * @regs: pt_regs which contains kernel stack pointer.
>> + * @addr: address which is checked.
>> + *
>> + * regs_within_kernel_stack() checks @addr is within the kernel stack page(s).
>> + * If @addr is within the kernel stack, it returns true. If not, returns false.
>> + */
>> +bool regs_within_kernel_stack(struct pt_regs *regs, unsigned long addr)
>> +{
>> + return ((addr & ~(THREAD_SIZE - 1)) ==
>> + (kernel_stack_pointer(regs) & ~(THREAD_SIZE - 1)));
>> +}
>> +
>> +/**
>> + * regs_get_kernel_stack_nth() - get Nth entry of the stack
>> + * @regs: pt_regs which contains kernel stack pointer.
>> + * @n: stack entry number.
>> + *
>> + * regs_get_kernel_stack_nth() returns @n th entry of the kernel stack which
>> + * is specified by @regs. If the @n th entry is NOT in the kernel stack,
>> + * this returns 0.
>> + */
>> +unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs, unsigned int n)
>> +{
>> + unsigned long *addr = (unsigned long *)kernel_stack_pointer(regs);
>> +
>> + addr += n;
>> + if (regs_within_kernel_stack(regs, (unsigned long)addr))
>> + return *addr;
>> + else
>> + return 0;
>> +}
>> +
>> /*
>> * TODO: does not yet catch signals sent when the child dies.
>> * in exit.c or in signal.c.
>>
>
> Thanks,
>
> M.
>
Thanks for your feedback.
-dl
On 03/14/2016 03:38 AM, Marc Zyngier wrote:
> On Mon, 14 Mar 2016 09:34:55 +0530
> Pratyush Anand <[email protected]> wrote:
>
> Hi Pratyush,
>
>> On 13/03/2016:12:09:03 PM, Marc Zyngier wrote:
>>> On Wed, 9 Mar 2016 00:32:18 -0500
>>> David Long <[email protected]> wrote:
>>>
>>>> +pstate_check_t * const opcode_condition_checks[16] = {
>>>> + __check_eq, __check_ne, __check_cs, __check_cc,
>>>> + __check_mi, __check_pl, __check_vs, __check_vc,
>>>> + __check_hi, __check_ls, __check_ge, __check_lt,
>>>> + __check_gt, __check_le, __check_al, __check_al
>>>
>>> The very last entry seems wrong, or is at least the opposite of what
>>> the current code has. It should be something called __check_nv(), and
>>> always return false (condition code NEVER).
>>
>> May be __check_nv() name is more appropriate as per definition, but shouldn't it
>> still return true, because TRM says:
>> "The condition code NV exists only to provide a valid disassembly of the 0b1111
>> encoding, otherwise its behavior is identical to AL"
>
> Indeed, I missed that. But this interpretation is for the A64
> instruction set, and this array is also used by the new
> arm32_check_condition. The condition code table for A32 seems to
> completely ignore the 0b1111 code (there is simply no entry for it), and
> it is only in the ConditionHolds pseudocode that you can see how this
> is actually special-cased.
>
> So I'm fine leaving the code as it is, but a comment and a pointer to
> the ARMv8 ARM wouldn't go amiss.
>
> Thanks,
>
> M.
>
OK.
-dl
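For reference, a stand-alone sketch of such a condition table, following the A64 reading quoted above in which the 0b1111 encoding behaves like AL. The macro and function names are hypothetical and the code is a user-space model, not the kernel's opcode_condition_checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PSTATE condition flags: N, Z, C, V in bits 31..28. */
#define MODEL_N (UINT64_C(1) << 31)
#define MODEL_Z (UINT64_C(1) << 30)
#define MODEL_C (UINT64_C(1) << 29)
#define MODEL_V (UINT64_C(1) << 28)

typedef bool (model_check_t)(uint64_t pstate);

static bool flag_n(uint64_t p) { return (p & MODEL_N) != 0; }
static bool flag_z(uint64_t p) { return (p & MODEL_Z) != 0; }
static bool flag_c(uint64_t p) { return (p & MODEL_C) != 0; }
static bool flag_v(uint64_t p) { return (p & MODEL_V) != 0; }

static bool chk_eq(uint64_t p) { return flag_z(p); }
static bool chk_ne(uint64_t p) { return !flag_z(p); }
static bool chk_cs(uint64_t p) { return flag_c(p); }
static bool chk_cc(uint64_t p) { return !flag_c(p); }
static bool chk_mi(uint64_t p) { return flag_n(p); }
static bool chk_pl(uint64_t p) { return !flag_n(p); }
static bool chk_vs(uint64_t p) { return flag_v(p); }
static bool chk_vc(uint64_t p) { return !flag_v(p); }
static bool chk_hi(uint64_t p) { return flag_c(p) && !flag_z(p); }
static bool chk_ls(uint64_t p) { return !flag_c(p) || flag_z(p); }
static bool chk_ge(uint64_t p) { return flag_n(p) == flag_v(p); }
static bool chk_lt(uint64_t p) { return flag_n(p) != flag_v(p); }
static bool chk_gt(uint64_t p) { return !flag_z(p) && flag_n(p) == flag_v(p); }
static bool chk_le(uint64_t p) { return flag_z(p) || flag_n(p) != flag_v(p); }
static bool chk_al(uint64_t p) { (void)p; return true; }

/* Indexed by the 4-bit condition field; 0b1111 (NV) is treated as AL. */
static model_check_t * const model_checks[16] = {
	chk_eq, chk_ne, chk_cs, chk_cc,
	chk_mi, chk_pl, chk_vs, chk_vc,
	chk_hi, chk_ls, chk_ge, chk_lt,
	chk_gt, chk_le, chk_al, chk_al,
};

static bool model_condition_holds(unsigned int cond, uint64_t pstate)
{
	assert(cond < 16);
	return model_checks[cond](pstate);
}
```

Whether the last entry is right for the A32 users of the table is the open point Marc raises, which is why a comment pointing at the ARMv8 ARM is worth having next to it.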
On 03/11/2016 10:56 PM, Marc Zyngier wrote:
> On Wed, 9 Mar 2016 00:32:20 -0500
> David Long <[email protected]> wrote:
>
> David,
>
>> From: Sandeepa Prabhu <[email protected]>
>>
>> Kprobes needs simulation of instructions that cannot be stepped
>> from a different memory location, e.g.: those instructions
>> that use PC-relative addressing. In simulation, the behaviour
>> of the instruction is implemented using a copy of pt_regs.
>>
>> The following instruction categories are simulated:
>> - All branching instructions(conditional, register, and immediate)
>> - Literal access instructions(load-literal, adr/adrp)
>>
>> Conditional execution is limited to branching instructions in
>> ARM v8. If conditions at PSTATE do not match the condition fields
>> of opcode, the instruction is effectively NOP.
>>
>> Thanks to Will Cohen for assorted suggested changes.
>>
>> Signed-off-by: Sandeepa Prabhu <[email protected]>
>> Signed-off-by: William Cohen <[email protected]>
>> Signed-off-by: David A. Long <[email protected]>
>> ---
>> arch/arm64/include/asm/insn.h | 1 +
>> arch/arm64/include/asm/probes.h | 5 +-
>> arch/arm64/kernel/Makefile | 3 +-
>> arch/arm64/kernel/insn.c | 1 +
>> arch/arm64/kernel/kprobes-arm64.c | 29 ++++
>> arch/arm64/kernel/kprobes.c | 32 ++++-
>> arch/arm64/kernel/probes-simulate-insn.c | 218 +++++++++++++++++++++++++++++++
>> arch/arm64/kernel/probes-simulate-insn.h | 28 ++++
>> 8 files changed, 311 insertions(+), 6 deletions(-)
>> create mode 100644 arch/arm64/kernel/probes-simulate-insn.c
>> create mode 100644 arch/arm64/kernel/probes-simulate-insn.h
>>
>> diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
>> index b9567a1..26cee10 100644
>> --- a/arch/arm64/include/asm/insn.h
>> +++ b/arch/arm64/include/asm/insn.h
>> @@ -410,6 +410,7 @@ u32 aarch32_insn_mcr_extract_crm(u32 insn);
>>
>> typedef bool (pstate_check_t)(unsigned long);
>> extern pstate_check_t * const opcode_condition_checks[16];
>> +
>> #endif /* __ASSEMBLY__ */
>>
>> #endif /* __ASM_INSN_H */
>> diff --git a/arch/arm64/include/asm/probes.h b/arch/arm64/include/asm/probes.h
>> index c5fcbe6..d524f7d 100644
>> --- a/arch/arm64/include/asm/probes.h
>> +++ b/arch/arm64/include/asm/probes.h
>> @@ -15,11 +15,12 @@
>> #ifndef _ARM_PROBES_H
>> #define _ARM_PROBES_H
>>
>> +#include <asm/opcodes.h>
>> +
>> struct kprobe;
>> struct arch_specific_insn;
>>
>> typedef u32 kprobe_opcode_t;
>> -typedef unsigned long (kprobes_pstate_check_t)(unsigned long);
>> typedef void (kprobes_handler_t) (u32 opcode, long addr, struct pt_regs *);
>>
>> enum pc_restore_type {
>> @@ -35,7 +36,7 @@ struct kprobe_pc_restore {
>> /* architecture specific copy of original instruction */
>> struct arch_specific_insn {
>> kprobe_opcode_t *insn;
>> - kprobes_pstate_check_t *pstate_cc;
>> + pstate_check_t *pstate_cc;
>> kprobes_handler_t *handler;
>> /* restore address after step xol */
>> struct kprobe_pc_restore restore;
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index 4efb791..08325e5 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -36,7 +36,8 @@ arm64-obj-$(CONFIG_CPU_PM) += sleep.o suspend.o
>> arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
>> arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
>> arm64-obj-$(CONFIG_KGDB) += kgdb.o
>> -arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o
>> +arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o \
>> + probes-simulate-insn.o
>> arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o
>> arm64-obj-$(CONFIG_PCI) += pci.o
>> arm64-obj-$(CONFIG_ARMV8_DEPRECATED) += armv8_deprecated.o
>> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
>> index 9f15ceb..f9a3432 100644
>> --- a/arch/arm64/kernel/insn.c
>> +++ b/arch/arm64/kernel/insn.c
>> @@ -30,6 +30,7 @@
>> #include <asm/cacheflush.h>
>> #include <asm/debug-monitors.h>
>> #include <asm/fixmap.h>
>> +#include <asm/opcodes.h>
>> #include <asm/insn.h>
>>
>> #define AARCH64_INSN_SF_BIT BIT(31)
>> diff --git a/arch/arm64/kernel/kprobes-arm64.c b/arch/arm64/kernel/kprobes-arm64.c
>> index e07727a..487238a 100644
>> --- a/arch/arm64/kernel/kprobes-arm64.c
>> +++ b/arch/arm64/kernel/kprobes-arm64.c
>> @@ -21,6 +21,7 @@
>> #include <asm/sections.h>
>>
>> #include "kprobes-arm64.h"
>> +#include "probes-simulate-insn.h"
>>
>> static bool __kprobes aarch64_insn_is_steppable(u32 insn)
>> {
>> @@ -62,8 +63,36 @@ arm_probe_decode_insn(kprobe_opcode_t insn, struct arch_specific_insn *asi)
>> */
>> if (aarch64_insn_is_steppable(insn))
>> return INSN_GOOD;
>> +
>> + if (aarch64_insn_is_bcond(insn)) {
>> + asi->handler = simulate_b_cond;
>> + } else if (aarch64_insn_is_cbz(insn) ||
>> + aarch64_insn_is_cbnz(insn)) {
>> + asi->handler = simulate_cbz_cbnz;
>> + } else if (aarch64_insn_is_tbz(insn) ||
>> + aarch64_insn_is_tbnz(insn)) {
>> + asi->handler = simulate_tbz_tbnz;
>> + } else if (aarch64_insn_is_adr_adrp(insn))
>> + asi->handler = simulate_adr_adrp;
>> + else if (aarch64_insn_is_b(insn) ||
>> + aarch64_insn_is_bl(insn))
>> + asi->handler = simulate_b_bl;
>> + else if (aarch64_insn_is_br(insn) ||
>> + aarch64_insn_is_blr(insn) ||
>> + aarch64_insn_is_ret(insn))
>> + asi->handler = simulate_br_blr_ret;
>> + else if (aarch64_insn_is_ldr_lit(insn))
>> + asi->handler = simulate_ldr_literal;
>> + else if (aarch64_insn_is_ldrsw_lit(insn))
>> + asi->handler = simulate_ldrsw_literal;
>> else
>> + /*
>> + * Instruction cannot be stepped out-of-line and we don't
>> + * (yet) simulate it.
>> + */
>> return INSN_REJECTED;
>> +
>> + return INSN_GOOD_NO_SLOT;
>> }
>>
>> static bool __kprobes
>> diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
>> index e72dbce..ffc5affd 100644
>> --- a/arch/arm64/kernel/kprobes.c
>> +++ b/arch/arm64/kernel/kprobes.c
>> @@ -40,6 +40,9 @@ void jprobe_return_break(void);
>> DEFINE_PER_CPU(struct kprobe *, current_kprobe) = NULL;
>> DEFINE_PER_CPU(struct kprobe_ctlblk, kprobe_ctlblk);
>>
>> +static void __kprobes
>> +post_kprobe_handler(struct kprobe_ctlblk *, struct pt_regs *);
>> +
>> static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>> {
>> /* prepare insn slot */
>> @@ -57,6 +60,24 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
>> p->ainsn.restore.type = RESTORE_PC;
>> }
>>
>> +static void __kprobes arch_prepare_simulate(struct kprobe *p)
>> +{
>> + /* This instruction is not executed XOL. No need to adjust the PC. */
>> + p->ainsn.restore.addr = 0;
>> + p->ainsn.restore.type = NO_RESTORE;
>> +}
>> +
>> +static void __kprobes arch_simulate_insn(struct kprobe *p, struct pt_regs *regs)
>> +{
>> + struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
>> +
>> + if (p->ainsn.handler)
>> + p->ainsn.handler((u32)p->opcode, (long)p->addr, regs);
>> +
>> + /* single step simulated, now go for post processing */
>> + post_kprobe_handler(kcb, regs);
>> +}
>> +
>> int __kprobes arch_prepare_kprobe(struct kprobe *p)
>> {
>> unsigned long probe_addr = (unsigned long)p->addr;
>> @@ -73,7 +94,8 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
>> return -EINVAL;
>>
>> case INSN_GOOD_NO_SLOT: /* insn need simulation */
>> - return -EINVAL;
>> + p->ainsn.insn = NULL;
>> + break;
>>
>> case INSN_GOOD: /* instruction uses slot */
>> p->ainsn.insn = get_insn_slot();
>> @@ -83,7 +105,10 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
>> };
>>
>> /* prepare the instruction */
>> - arch_prepare_ss_slot(p);
>> + if (p->ainsn.insn)
>> + arch_prepare_ss_slot(p);
>> + else
>> + arch_prepare_simulate(p);
>>
>> return 0;
>> }
>> @@ -225,7 +250,8 @@ static void __kprobes setup_singlestep(struct kprobe *p,
>> kernel_enable_single_step(regs);
>> instruction_pointer(regs) = slot;
>> } else {
>> - BUG();
>> + /* insn simulation */
>> + arch_simulate_insn(p, regs);
>> }
>> }
>>
>> diff --git a/arch/arm64/kernel/probes-simulate-insn.c b/arch/arm64/kernel/probes-simulate-insn.c
>> new file mode 100644
>> index 0000000..94333a6
>> --- /dev/null
>> +++ b/arch/arm64/kernel/probes-simulate-insn.c
>> @@ -0,0 +1,218 @@
>> +/*
>> + * arch/arm64/kernel/probes-simulate-insn.c
>> + *
>> + * Copyright (C) 2013 Linaro Limited.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + * General Public License for more details.
>> + */
>> +
>> +#include <linux/kernel.h>
>> +#include <linux/kprobes.h>
>> +#include <linux/module.h>
>> +
>> +#include "probes-simulate-insn.h"
>> +
>> +#define sign_extend(x, signbit) \
>> + ((x) | (0 - ((x) & (1 << (signbit)))))
>> +
>> +#define bbl_displacement(insn) \
>> + sign_extend(((insn) & 0x3ffffff) << 2, 27)
>> +
>> +#define bcond_displacement(insn) \
>> + sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
>> +
>> +#define cbz_displacement(insn) \
>> + sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
>> +
>> +#define tbz_displacement(insn) \
>> + sign_extend(((insn >> 5) & 0x3fff) << 2, 15)
>> +
>> +#define ldr_displacement(insn) \
>> + sign_extend(((insn >> 5) & 0x7ffff) << 2, 20)
>> +
>> +static inline void set_x_reg(struct pt_regs *regs, int reg, u64 val)
>> +{
>> + if (reg < 31)
>> + regs->regs[reg] = val;
>> +}
>> +
>> +static inline void set_w_reg(struct pt_regs *regs, int reg, u64 val)
>> +{
>> + if (reg < 31)
>> + *(u32 *) (&regs->regs[reg]) = val;
>
> I'm afraid this is subtly buggy. A "ldr w0, =value" will write the
> entire register, clearing the top 32 bits. Here, you're only writing
> the bottom 32bits (not to mention that this looks completely broken on
> BE).
>
> A much better way of writing this would be:
>
> regs->regs[reg] = lower_32_bits(val);
>
OK, that looks clear enough.
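
For illustration only, here is a minimal user-space sketch of the semantics being discussed: a write to a W register zero-extends into the full 64-bit X register, and a read returns only the low 32 bits. It is not the code from any later revision of the patch. struct mock_pt_regs and low32() are hypothetical stand-ins for the kernel's struct pt_regs and lower_32_bits().

```c
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs; only x0-x30 are modelled. */
struct mock_pt_regs {
	uint64_t regs[31];
};

/* Stand-in for the kernel's lower_32_bits() helper. */
static inline uint32_t low32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

/* Taking a u32 makes the zero-extension explicit and endian-neutral. */
static inline void set_w_reg(struct mock_pt_regs *regs, int reg, uint32_t val)
{
	/* Register number 31 is the zero register here: the write is dropped. */
	if (reg < 31)
		regs->regs[reg] = val;	/* upper 32 bits become zero */
}

static inline uint32_t get_w_reg(const struct mock_pt_regs *regs, int reg)
{
	return (reg < 31) ? low32(regs->regs[reg]) : 0;
}
```

Assigning through the 64-bit array element, rather than through a (u32 *) cast, avoids both the stale upper half and the big-endian problem.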
>> +}
>> +
>> +static inline u64 get_x_reg(struct pt_regs *regs, int reg)
>> +{
>> + if (reg < 31)
>> + return regs->regs[reg];
>> + else
>> + return 0;
>> +}
>> +
>> +static inline u32 get_w_reg(struct pt_regs *regs, int reg)
>> +{
>> + if (reg < 31)
>> + return regs->regs[reg] & 0xffffffff;
>
> return lower_32_bits(regs->regs[reg]);
>
Right.
>> + else
>> + return 0;
>> +}
>> +
>> +static bool __kprobes check_cbz(u32 opcode, struct pt_regs *regs)
>> +{
>> + int xn = opcode & 0x1f;
>> +
>> + return (opcode & (1 << 31)) ?
>> + (get_x_reg(regs, xn) == 0) : (get_w_reg(regs, xn) == 0);
>> +}
>> +
>> +static bool __kprobes check_cbnz(u32 opcode, struct pt_regs *regs)
>> +{
>> + int xn = opcode & 0x1f;
>> +
>> + return (opcode & (1 << 31)) ?
>> + (get_x_reg(regs, xn) != 0) : (get_w_reg(regs, xn) != 0);
>> +}
>> +
>> +static bool __kprobes check_tbz(u32 opcode, struct pt_regs *regs)
>> +{
>> + int xn = opcode & 0x1f;
>> + int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);
>> +
>> + return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) == 0;
>> +}
>> +
>> +static bool __kprobes check_tbnz(u32 opcode, struct pt_regs *regs)
>> +{
>> + int xn = opcode & 0x1f;
>> + int bit_pos = ((opcode & (1 << 31)) >> 26) | ((opcode >> 19) & 0x1f);
>> +
>> + return ((get_x_reg(regs, xn) >> bit_pos) & 0x1) != 0;
>> +}
>> +
>> +/*
>> + * instruction simulation functions
>> + */
>> +void __kprobes
>> +simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + long imm, xn, val;
>> +
>> + xn = opcode & 0x1f;
>> + imm = ((opcode >> 3) & 0x1ffffc) | ((opcode >> 29) & 0x3);
>> + imm = sign_extend(imm, 20);
>> + if (opcode & 0x80000000)
>> + val = (imm<<12) + (addr & 0xfffffffffffff000);
>> + else
>> + val = imm + addr;
>> +
>> + set_x_reg(regs, xn, val);
>> +
>> + instruction_pointer(regs) += 4;
>> +}
>> +
>> +void __kprobes
>> +simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + int disp = bbl_displacement(opcode);
>> +
>> + /* Link register is x30 */
>> + if (opcode & (1 << 31))
>> + set_x_reg(regs, 30, addr + 4);
>> +
>> + instruction_pointer(regs) = addr + disp;
>> +}
>> +
>> +void __kprobes
>> +simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + int disp = 4;
>> +
>> + if (opcode_condition_checks[opcode & 0xf](regs->pstate & 0xffffffff))
>> + disp = bcond_displacement(opcode);
>> +
>> + instruction_pointer(regs) = addr + disp;
>> +}
>> +
>> +void __kprobes
>> +simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + int xn = (opcode >> 5) & 0x1f;
>> +
>> + /* update pc first in case we're doing a "blr lr" */
>> + instruction_pointer(regs) = get_x_reg(regs, xn);
>> +
>> + /* Link register is x30 */
>> + if (((opcode >> 21) & 0x3) == 1)
>> + set_x_reg(regs, 30, addr + 4);
>> +}
>> +
>> +void __kprobes
>> +simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + int disp = 4;
>> +
>> + if (opcode & (1 << 24)) {
>> + if (check_cbnz(opcode, regs))
>> + disp = cbz_displacement(opcode);
>> + } else {
>> + if (check_cbz(opcode, regs))
>> + disp = cbz_displacement(opcode);
>> + }
>> + instruction_pointer(regs) = addr + disp;
>> +}
>> +
>> +void __kprobes
>> +simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + int disp = 4;
>> +
>> + if (opcode & (1 << 24)) {
>> + if (check_tbnz(opcode, regs))
>> + disp = tbz_displacement(opcode);
>> + } else {
>> + if (check_tbz(opcode, regs))
>> + disp = tbz_displacement(opcode);
>> + }
>> + instruction_pointer(regs) = addr + disp;
>> +}
>> +
>> +void __kprobes
>> +simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + u64 *load_addr;
>> + int xn = opcode & 0x1f;
>> + int disp;
>> +
>> + disp = ldr_displacement(opcode);
>> + load_addr = (u64 *) (addr + disp);
>> +
>> + if (opcode & (1 << 30)) /* x0-x30 */
>> + set_x_reg(regs, xn, *load_addr);
>> + else /* w0-w30 */
>> + set_w_reg(regs, xn, (*(u32 *) (load_addr)));
>
> If you're passing a u32 to set_w_reg(), why is the prototype taking a
> u64?
>
Oops, will fix that.
>> +
>> + instruction_pointer(regs) += 4;
>> +}
>> +
>> +void __kprobes
>> +simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs)
>> +{
>> + s32 *load_addr;
>> + int xn = opcode & 0x1f;
>> + int disp;
>> +
>> + disp = ldr_displacement(opcode);
>> + load_addr = (s32 *) (addr + disp);
>> +
>> + set_x_reg(regs, xn, *load_addr);
>> +
>> + instruction_pointer(regs) += 4;
>> +}
>> diff --git a/arch/arm64/kernel/probes-simulate-insn.h b/arch/arm64/kernel/probes-simulate-insn.h
>> new file mode 100644
>> index 0000000..d6bb9a5
>> --- /dev/null
>> +++ b/arch/arm64/kernel/probes-simulate-insn.h
>> @@ -0,0 +1,28 @@
>> +/*
>> + * arch/arm64/kernel/probes-simulate-insn.h
>> + *
>> + * Copyright (C) 2013 Linaro Limited
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>> + * General Public License for more details.
>> + */
>> +
>> +#ifndef _ARM_KERNEL_PROBES_SIMULATE_INSN_H
>> +#define _ARM_KERNEL_PROBES_SIMULATE_INSN_H
>> +
>> +void simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs);
>> +void simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs);
>> +void simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs);
>> +void simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs);
>> +void simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs);
>> +void simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs);
>> +void simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs);
>> +void simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs);
>> +
>> +#endif /* _ARM_KERNEL_PROBES_SIMULATE_INSN_H */
>
>
> Thanks,
>
> M.
>
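
As a side note to the simulation code quoted above, the displacement macros can be checked in user space. The sketch below is an assumption-laden rewrite, not the patch's code: it decodes the same immediate fields (imm26 for B/BL, imm19 for B.cond, CBZ/CBNZ and LDR literal, imm14 plus the b5:b40 bit number for TBZ/TBNZ) with an explicit 64-bit sign extension. All function names are hypothetical.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of `value` to a signed 64-bit integer. */
static inline int64_t sign_extend64(uint64_t value, unsigned int bits)
{
	uint64_t sign = 1ULL << (bits - 1);

	value &= (sign << 1) - 1;		/* keep only the field */
	return (int64_t)((value ^ sign) - sign);
}

/* B/BL: imm26 in bits [25:0], scaled by 4. */
static inline int64_t b_bl_offset(uint32_t insn)
{
	return sign_extend64(insn & 0x3ffffff, 26) * 4;
}

/* B.cond, CBZ/CBNZ, LDR (literal): imm19 in bits [23:5], scaled by 4. */
static inline int64_t imm19_offset(uint32_t insn)
{
	return sign_extend64((insn >> 5) & 0x7ffff, 19) * 4;
}

/* TBZ/TBNZ: imm14 in bits [18:5], scaled by 4. */
static inline int64_t tbz_offset(uint32_t insn)
{
	return sign_extend64((insn >> 5) & 0x3fff, 14) * 4;
}

/* TBZ/TBNZ bit number: b5 is bit 31, b40 is bits [23:19]. */
static inline unsigned int tbz_bit(uint32_t insn)
{
	return ((insn >> 31) << 5) | ((insn >> 19) & 0x1f);
}
```

For example, the encoding 0x17ffffff (a branch to the previous instruction) yields an offset of -4, and 0xb6080021 (tbz x1, #33 with a forward offset of one instruction) yields bit number 33 and offset 4.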
On 03/13/2016 09:52 AM, Marc Zyngier wrote:
> On Wed, 9 Mar 2016 00:32:21 -0500
> David Long <[email protected]> wrote:
>
> David,
>
> I remember looking at that code over your shoulder whilst at Connect
> last week, but I clearly wasn't running on all cylinders, because there
> are a few gotchas here - see below.
>
>> From: William Cohen <[email protected]>
>>
>> The trampoline code is used by kretprobes to capture a return from a probed
>> function. This is done by saving the registers, calling the handler, and
>> restoring the registers. The code then returns to the original saved caller
>> return address. It is necessary to do this directly instead of using a
>> software breakpoint because the code used in processing that breakpoint
>> could itself be kprobe'd and cause a problematic reentry into the debug
>> exception handler.
>>
>> Signed-off-by: William Cohen <[email protected]>
>> Signed-off-by: David A. Long <[email protected]>
>> ---
>> arch/arm64/include/asm/kprobes.h | 2 +
>> arch/arm64/kernel/Makefile | 1 +
>> arch/arm64/kernel/asm-offsets.c | 11 +++++
>> arch/arm64/kernel/kprobes.c | 5 ++
>> arch/arm64/kernel/kprobes_trampoline.S | 88 ++++++++++++++++++++++++++++++++++
>> 5 files changed, 107 insertions(+)
>> create mode 100644 arch/arm64/kernel/kprobes_trampoline.S
>>
>> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
>> index 79c9511..61b4915 100644
>> --- a/arch/arm64/include/asm/kprobes.h
>> +++ b/arch/arm64/include/asm/kprobes.h
>> @@ -56,5 +56,7 @@ int kprobe_exceptions_notify(struct notifier_block *self,
>> unsigned long val, void *data);
>> int kprobe_breakpoint_handler(struct pt_regs *regs, unsigned int esr);
>> int kprobe_single_step_handler(struct pt_regs *regs, unsigned int esr);
>> +void kretprobe_trampoline(void);
>> +void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
>>
>> #endif /* _ARM_KPROBES_H */
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index 08325e5..f192b7d 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -37,6 +37,7 @@ arm64-obj-$(CONFIG_CPU_IDLE) += cpuidle.o
>> arm64-obj-$(CONFIG_JUMP_LABEL) += jump_label.o
>> arm64-obj-$(CONFIG_KGDB) += kgdb.o
>> arm64-obj-$(CONFIG_KPROBES) += kprobes.o kprobes-arm64.o \
>> + kprobes_trampoline.o \
>> probes-simulate-insn.o
>> arm64-obj-$(CONFIG_EFI) += efi.o efi-entry.stub.o
>> arm64-obj-$(CONFIG_PCI) += pci.o
>> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
>> index fffa4ac6..f7cc8ce 100644
>> --- a/arch/arm64/kernel/asm-offsets.c
>> +++ b/arch/arm64/kernel/asm-offsets.c
>> @@ -50,6 +50,17 @@ int main(void)
>> DEFINE(S_X5, offsetof(struct pt_regs, regs[5]));
>> DEFINE(S_X6, offsetof(struct pt_regs, regs[6]));
>> DEFINE(S_X7, offsetof(struct pt_regs, regs[7]));
>> + DEFINE(S_X8, offsetof(struct pt_regs, regs[8]));
>> + DEFINE(S_X10, offsetof(struct pt_regs, regs[10]));
>> + DEFINE(S_X12, offsetof(struct pt_regs, regs[12]));
>> + DEFINE(S_X14, offsetof(struct pt_regs, regs[14]));
>> + DEFINE(S_X16, offsetof(struct pt_regs, regs[16]));
>> + DEFINE(S_X18, offsetof(struct pt_regs, regs[18]));
>> + DEFINE(S_X20, offsetof(struct pt_regs, regs[20]));
>> + DEFINE(S_X22, offsetof(struct pt_regs, regs[22]));
>> + DEFINE(S_X24, offsetof(struct pt_regs, regs[24]));
>> + DEFINE(S_X26, offsetof(struct pt_regs, regs[26]));
>> + DEFINE(S_X28, offsetof(struct pt_regs, regs[28]));
>> DEFINE(S_LR, offsetof(struct pt_regs, regs[30]));
>> DEFINE(S_SP, offsetof(struct pt_regs, sp));
>> #ifdef CONFIG_COMPAT
>> diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
>> index ffc5affd..bd3f233 100644
>> --- a/arch/arm64/kernel/kprobes.c
>> +++ b/arch/arm64/kernel/kprobes.c
>> @@ -532,6 +532,11 @@ int __kprobes longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
>> return 1;
>> }
>>
>> +void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
>> +{
>> + return NULL;
>> +}
>> +
>> int __init arch_init_kprobes(void)
>> {
>> return 0;
>> diff --git a/arch/arm64/kernel/kprobes_trampoline.S b/arch/arm64/kernel/kprobes_trampoline.S
>> new file mode 100644
>> index 0000000..072b4e5
>> --- /dev/null
>> +++ b/arch/arm64/kernel/kprobes_trampoline.S
>> @@ -0,0 +1,88 @@
>> +/*
>> + * trampoline entry and return code for kretprobes.
>> + */
>> +
>> +#include <linux/linkage.h>
>> +#include <asm/asm-offsets.h>
>> +#include <asm/assembler.h>
>> +
>> + .text
>> +
>> +.macro save_all_base_regs ctxt
>> + stp x0, x1, [\ctxt, #S_X0]
>> + stp x2, x3, [\ctxt, #S_X2]
>> + stp x4, x5, [\ctxt, #S_X4]
>> + stp x6, x7, [\ctxt, #S_X6]
>> + stp x8, x9, [\ctxt, #S_X8]
>> + stp x10, x11, [\ctxt, #S_X10]
>> + stp x12, x13, [\ctxt, #S_X12]
>> + stp x14, x15, [\ctxt, #S_X14]
>> + stp x16, x17, [\ctxt, #S_X16]
>> + stp x18, x19, [\ctxt, #S_X18]
>> + stp x20, x21, [\ctxt, #S_X20]
>> + stp x22, x23, [\ctxt, #S_X22]
>> + stp x24, x25, [\ctxt, #S_X24]
>> + stp x26, x27, [\ctxt, #S_X26]
>> + stp x28, x29, [\ctxt, #S_X28]
>> + str lr, [\ctxt, #S_LR]
>> + add x0, \ctxt, #S_FRAME_SIZE
>> + str x0, [\ctxt, #S_SP]
>
> Nit: this could also be rewritten as:
>
> add x0, \ctxt, #S_FRAME_SIZE
> stp lr, x0, [\ctxt, #S_LR]
>
> Another thing worth noting is that since your macro saves all the
> GP registers, only SP can be used for the ctxt parameter. This means
> you're better off hardcoding SP in this macro, and not give the
> illusion of being generic.
>
OK.
>> +/*
>> + * Construct a useful saved PSTATE
>> + */
>> + mrs x0, nzcv
>> + and x0, x0, #0xf0000000
>
> It'd be worth spelling this as (PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT |
> PSR_V_BIT)...
>
OK.
>> + mrs x1, daif
>> + and x1, x1, #0x3c0
>
> ... and this as (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT).
>
OK.
>> + orr x0, x0, x1
>> + mrs x1, CurrentEL
>> + and x1, x1, #12
>
> I'd like this '12' to be a bit more explicit. How about (3 << 2)? I
> tend to parse this kind of thing much more easily...
>
OK.
>> + lsl x1, x1, #21
>
> What's that shift for? The various bits should be at the right place
> already.
>
At the time it made sense, but rereading the ARM ARM it's not making so
much sense.
>> + orr x0, x1, x0
>> + mrs x1, SPSel
>> + and x1, x1, #1
>> + lsl x1, x1, #21
>
> Same here (you're in single-step territory, which is probably not what
> you want...).
>
See previous comment.
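
To make the point about the shifts concrete, here is a small C model of the PSTATE composition, assuming the architectural layouts: NZCV in bits [31:28], DAIF in bits [9:6], CurrentEL.EL in bits [3:2] and SPSel.SP in bit 0. Those are already the positions the saved PSTATE expects, so no shift is needed, and bit 21 (the single-step bit) stays clear. build_saved_pstate() is a hypothetical name; the PSR_* values are written out as I understand them from the arm64 ptrace header.

```c
#include <stdint.h>

/* PSR_* masks, assumed to match the arm64 ptrace definitions. */
#define PSR_F_BIT	0x00000040
#define PSR_I_BIT	0x00000080
#define PSR_A_BIT	0x00000100
#define PSR_D_BIT	0x00000200
#define PSR_V_BIT	0x10000000
#define PSR_C_BIT	0x20000000
#define PSR_Z_BIT	0x40000000
#define PSR_N_BIT	0x80000000

/* Arguments model the raw values read by mrs from each system register. */
static inline uint64_t build_saved_pstate(uint64_t nzcv, uint64_t daif,
					  uint64_t current_el, uint64_t spsel)
{
	uint64_t pstate;

	pstate  = nzcv & (PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT);
	pstate |= daif & (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT);
	pstate |= current_el & (3 << 2);	/* EL field, already at bits [3:2] */
	pstate |= spsel & 1;			/* SP select, already at bit 0 */
	return pstate;
}
```

With EL1 and SP_EL1 selected, the low nibble comes out as 0x5, which is the EL1h mode encoding.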
>> + orr x0, x1, x0
>> + str x0, [\ctxt, #S_PSTATE]
>> +.endm
>> +
>> +.macro restore_all_base_regs ctxt
>
> Same remark about the pseudo-generic parameter.
>
OK.
>> + ldr x0, [\ctxt, #S_PSTATE]
>> + and x0, x0, #0xf0000000
>
> Same remark about using the PSR_* macros.
>
OK.
>> + msr nzcv, x0
>> + ldp x0, x1, [\ctxt, #S_X0]
>> + ldp x2, x3, [\ctxt, #S_X2]
>> + ldp x4, x5, [\ctxt, #S_X4]
>> + ldp x6, x7, [\ctxt, #S_X6]
>> + ldp x8, x9, [\ctxt, #S_X8]
>> + ldp x10, x11, [\ctxt, #S_X10]
>> + ldp x12, x13, [\ctxt, #S_X12]
>> + ldp x14, x15, [\ctxt, #S_X14]
>> + ldp x16, x17, [\ctxt, #S_X16]
>> + ldp x18, x19, [\ctxt, #S_X18]
>> + ldp x20, x21, [\ctxt, #S_X20]
>> + ldp x22, x23, [\ctxt, #S_X22]
>> + ldp x24, x25, [\ctxt, #S_X24]
>> + ldp x26, x27, [\ctxt, #S_X26]
>> + ldp x28, x29, [\ctxt, #S_X28]
>> +.endm
>> +
>> +ENTRY(kretprobe_trampoline)
>> +
>> + sub sp, sp, #S_FRAME_SIZE
>> +
>> + save_all_base_regs sp
>> +
>> + mov x0, sp
>> + bl trampoline_probe_handler
>> + /* Replace trampoline address in lr with actual
>> + orig_ret_addr return address. */
>> + mov lr, x0
>> +
>> + restore_all_base_regs sp
>> +
>> + add sp, sp, #S_FRAME_SIZE
>> +
>> + ret
>> +
>> +ENDPROC(kretprobe_trampoline)
>
> Thanks,
>
> M.
>
On 03/17/2016 08:58 AM, HIRAMATU, MASAMI wrote:
>> From: HIRAMATU, MASAMI [mailto:[email protected]]
>>
>> Hi,
>>
>>> From: Sandeepa Prabhu <[email protected]>
>>>
>>> The pre-handler of this special 'trampoline' kprobe executes the return
>>> probe handler functions and restores original return address in ELR_EL1.
>>> This way the saved pt_regs still hold the original register context to be
>>> carried back to the probed kernel function.
>>
>> This patch seems not well separated.
>>
>>> diff --git a/arch/arm64/kernel/kprobes.c b/arch/arm64/kernel/kprobes.c
>>> index bd3f233..13d3333 100644
>>> --- a/arch/arm64/kernel/kprobes.c
>>> +++ b/arch/arm64/kernel/kprobes.c
>>
>> [snip]
>>
>>> +void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
>>> + struct pt_regs *regs)
>>> +{
>>> + ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
>>> +
>>> + /* replace return addr (x30) with trampoline */
>>> + regs->regs[30] = (long)&kretprobe_trampoline;
>>
>> So, where is kretprobe_trampoline? It seems that function is
>> defined in another patch.
>>
>>> +}
>>> +
>>> +int __kprobes arch_trampoline_kprobe(struct kprobe *p)
>>> +{
>>> + return 0;
>>> }
>>
>> And what is this function for?
>
> Ah, sorry, this was my fault. Yes, this function is required.
> But this implementation also means there is an asm-based trampoline
> function which should be included in this patch.
>
> David, could you tell me the repository from which I can get the latest
> version of this series? I'd like to see the whole kprobes/arm64 code.
>
> Thank you,
>
It can be found in:
http://git.linaro.org/people/dave.long/linux.git
...in the kprobes64-v11 branch.
Thanks,
-dl
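
For readers following the question about where kretprobe_trampoline fits in, the return-address swap in the quoted arch_prepare_kretprobe() can be modelled in user space roughly as below. This is a sketch under assumptions, not the series' code: the mock_* types, MOCK_TRAMPOLINE_ADDR and trampoline_handler() are hypothetical, and the real handler has to look up the matching kretprobe instance rather than receive it as an argument.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures. */
struct mock_pt_regs {
	uint64_t regs[31];		/* x0-x30; x30 is the link register */
};

struct mock_kretprobe_instance {
	uint64_t ret_addr;		/* original return address */
};

/* Made-up address standing in for &kretprobe_trampoline. */
#define MOCK_TRAMPOLINE_ADDR	0xffff000008090000ULL

/* At function entry: remember the real return address, hijack x30. */
static void prepare_kretprobe(struct mock_kretprobe_instance *ri,
			      struct mock_pt_regs *regs)
{
	ri->ret_addr = regs->regs[30];
	regs->regs[30] = MOCK_TRAMPOLINE_ADDR;
}

/*
 * Called from the trampoline when the probed function "returns": the
 * value handed back is what the trampoline moves into lr before ret.
 */
static uint64_t trampoline_handler(const struct mock_kretprobe_instance *ri)
{
	return ri->ret_addr;
}
```

The probed function then returns into the trampoline instead of its caller, and the trampoline resumes at the saved address once the handlers have run.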
On Fri, Mar 18, 2016 at 06:59:02PM +0530, Pratyush Anand wrote:
> On 17/03/2016:01:27:26 PM, Pratyush Anand wrote:
> > @David: This patch was added in v9, and fixup_exception() was dropped in v9.
> > Since dropping fixup_exception() also caused some systemtap test cases to
> > fail, it was added back in v10. I wonder if we really need this patch.
> > Maybe you can try running the related test case with this patch dropped.
>
> I had a closer look at the code and noticed that fixup_exception() plays no
> role in handling a page fault in copy_to_user(). So why do we have the
> problem?
> I think I can see why it does not work. When we are single-stepping an
> instruction and a page fault occurs, we end up in el1_da in entry.S. There, we
> do enable_dbg. As soon as we do this, we start receiving a single-step
> exception after each instruction (not sure, possibly for every other
> instruction). Since there is no matching single-step handler for these
> instructions, we see the warning "Unexpected kernel single-step exception
> at EL1".
>
> So I think we should either
>
> (1) not enable debug for el1_da, or
> (2) do enable_dbg only when single stepping is not enabled, or
> (3) disable single stepping during el1_da execution.
>
> (1) will solve the issue for sure, but I am not sure it is the best choice.
>
> Will, what do you suggest?
Leaving debug exceptions disabled isn't something I'm keen on at all,
because it leads to blackspots in kernel debugging that I don't think
should be enforced by the low-level debug machinery. My preference is
for the higher-level debugger code (e.g. kprobes, kgdb) to ignore the
events that it's not interested in.
It's also very easy to lose track of the debug state if you run preemptible
code at EL1 with debug exceptions disabled, because kernel debugging is
per-cpu rather than per-task.
Will
Hi Will,
Thanks for the reply.
On 21/03/2016:02:52:43 PM, Will Deacon wrote:
> On Fri, Mar 18, 2016 at 06:59:02PM +0530, Pratyush Anand wrote:
> > On 17/03/2016:01:27:26 PM, Pratyush Anand wrote:
> > > @David: This patch was added in v9, and fixup_exception() was dropped in v9.
> > > Since dropping fixup_exception() also caused some systemtap test cases to
> > > fail, it was added back in v10. I wonder if we really need this patch.
> > > Maybe you can try running the related test case with this patch dropped.
> >
> > I had a closer look at the code and noticed that fixup_exception() plays no
> > role in handling a page fault in copy_to_user(). So why do we have the
> > problem?
> > I think I can see why it does not work. When we are single-stepping an
> > instruction and a page fault occurs, we end up in el1_da in entry.S. There, we
> > do enable_dbg. As soon as we do this, we start receiving a single-step
> > exception after each instruction (not sure, possibly for every other
> > instruction). Since there is no matching single-step handler for these
> > instructions, we see the warning "Unexpected kernel single-step exception
> > at EL1".
> >
> > So I think we should either
> >
> > (1) not enable debug for el1_da, or
> > (2) do enable_dbg only when single stepping is not enabled, or
> > (3) disable single stepping during el1_da execution.
> >
> > (1) will solve the issue for sure, but I am not sure it is the best choice.
> >
> > Will, what do you suggest?
>
> Leaving debug exceptions disabled isn't something I'm keen on at all,
> because it leads to blackspots in kernel debugging that I don't think
> should be enforced by the low-level debug machinery. My preference is
> for the higher-level debugger code (e.g. kprobes, kgdb) to ignore the
> events that it's not interested in.
I think this is what the current implementation does: in the given situation,
the higher-level debugger code ignores the single-step exception events that it
is not expecting.
Here, executing the single-stepped instruction raises another exception, say a
data abort. As soon as we enable debug exceptions while handling this data
abort, we start getting single-step exceptions for every instruction executed
in the data abort handler. None of the "higher-level debugger code" is
interested in those events, so it ignores them. We keep getting "Unexpected
kernel single-step exception at EL1" until all the instructions of the data
abort handler have been executed.
>
> It's also very easy to lose track of the debug state if you run preemptible
> code at EL1 with debug exceptions disabled, because kernel debugging is
> per-cpu rather than per-task.
OK. Thanks for this clarification. So one way could be for the higher-level
debugger code to set a per-cpu variable, which kernel_entry and kernel_exit
would check in order to disable/enable only single stepping. Do you think that
would be a good idea?
If yes, would adding a new u64 variable, say "flags", to struct pt_regs be
acceptable?
~Pratyush
Hi David,
on 2016/3/9 13:32, David Long wrote:
> +int __kprobes arch_prepare_kprobe(struct kprobe *p)
> +{
> + unsigned long probe_addr = (unsigned long)p->addr;
The address alignment should be verified here:
if (probe_addr & 0x3)
return -EINVAL;
Thanks,
Li Bin
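
As a standalone illustration of the check suggested above: A64 instructions are 4 bytes long and 4-byte aligned, so a probe address with either of its two low bits set cannot be an instruction boundary. The helper name check_probe_addr() and the A64_INSN_SIZE constant are hypothetical; in the kernel the test would sit at the top of arch_prepare_kprobe().

```c
#include <errno.h>
#include <stdint.h>

/* Every A64 instruction is 4 bytes long and 4-byte aligned. */
#define A64_INSN_SIZE	4

/* Return 0 for a plausible probe address, -EINVAL for a misaligned one. */
static int check_probe_addr(uintptr_t probe_addr)
{
	if (probe_addr & (A64_INSN_SIZE - 1))
		return -EINVAL;

	return 0;
}
```

Rejecting the address up front keeps a misaligned pointer from reaching the instruction copy and decode steps.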
> +
> + /* copy instruction */
> + p->opcode = le32_to_cpu(*p->addr);
> +
> + if (in_exception_text(probe_addr))
> + return -EINVAL;
> +
> + /* decode instruction */
> + switch (arm_kprobe_decode_insn(p->addr, &p->ainsn)) {
> + case INSN_REJECTED: /* insn not supported */
> + return -EINVAL;
> +
> + case INSN_GOOD_NO_SLOT: /* insn need simulation */
> + return -EINVAL;
> +
> + case INSN_GOOD: /* instruction uses slot */
> + p->ainsn.insn = get_insn_slot();
> + if (!p->ainsn.insn)
> + return -ENOMEM;
> + break;
> + };
> +
> + /* prepare the instruction */
> + arch_prepare_ss_slot(p);
> +
> + return 0;
> +}
> +