Hi all,
we would like to implement an architecture specific variant of "kprobes
on ftrace" without using the current HAVE_KPROBES_ON_FTRACE infrastructure
which is currently only used by x86.
The rationale for these two patches is:
- we want to patch the first instruction of the mcount code block to
reduce the overhead of the function tracer
- we'd like to keep the ftrace_caller function as simple as possible and
not require it to generate a 100% valid pt_regs structure as required
by the combination of DYNAMIC_FTRACE_WITH_REGS and HAVE_KPROBES_ON_FTRACE.
This allows us to not generate the psw mask field in the pt_regs
structure on each function tracer enabled function, which otherwise would
be very expensive. Besides that, program-check-generated pt_regs contents
are "more" accurate than program-generated ones and don't require any
maintenance.
And also we can keep the ftrace and kprobes backends quite separated.
In order to make this work, a small common code change is necessary which
removes a check whether a kprobe is being placed on an ftrace location (see
the first patch).
If possible, I'd like to have an ACK from at least one of the kprobes
maintainers for the first patch and bring it upstream via the s390 tree.
Thanks,
Heiko
Heiko Carstens (2):
kprobes: introduce ARCH_HANDLES_KPROBES_ON_FTRACE
s390/ftrace,kprobes: allow to patch first instruction
arch/Kconfig | 8 +++
arch/s390/Kconfig | 1 +
arch/s390/include/asm/ftrace.h | 52 ++++++++++++++--
arch/s390/include/asm/kprobes.h | 1 +
arch/s390/include/asm/lowcore.h | 4 +-
arch/s390/include/asm/pgtable.h | 12 ++++
arch/s390/kernel/asm-offsets.c | 1 -
arch/s390/kernel/early.c | 4 --
arch/s390/kernel/ftrace.c | 132 +++++++++++++++++++++++++---------------
arch/s390/kernel/kprobes.c | 87 ++++++++++++++++++--------
arch/s390/kernel/mcount.S | 1 +
arch/s390/kernel/setup.c | 2 -
arch/s390/kernel/smp.c | 1 -
kernel/kprobes.c | 3 +-
scripts/recordmcount.c | 2 +-
scripts/recordmcount.pl | 2 +-
16 files changed, 220 insertions(+), 93 deletions(-)
--
1.8.5.5
Allow architectures to implement handling of kprobes on function
tracer call sites on their own, without depending on common code.
This patch removes the kprobes check whether a kprobe is being placed
on a function tracer call site and therefore gives the architecture full
responsibility for handling this correctly.
Signed-off-by: Heiko Carstens <[email protected]>
---
arch/Kconfig | 8 ++++++++
kernel/kprobes.c | 3 ++-
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index 05d7a8a458d5..e1a8e0edf03f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -85,6 +85,14 @@ config KPROBES_ON_FTRACE
passing of pt_regs to function tracing, then kprobes can
optimize on top of function tracing.
+config ARCH_HANDLES_KPROBES_ON_FTRACE
+ def_bool n
+ help
+ If an architecture can handle kprobes on function tracer call
+ sites on its own, then this option should be selected. This option
+ removes the check which otherwise prevents setting kprobes on
+ function tracer call sites.
+
config UPROBES
def_bool n
select PERCPU_RWSEM
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 3995f546d0f3..4cc48aa67635 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1428,7 +1428,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
return -EILSEQ;
p->flags |= KPROBE_FLAG_FTRACE;
#else /* !CONFIG_KPROBES_ON_FTRACE */
- return -EINVAL;
+ if (!IS_ENABLED(CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE))
+ return -EINVAL;
#endif
}
--
1.8.5.5
If the function tracer is enabled, allow setting kprobes on the first
instruction of a function (which is the function trace caller):
If no kprobe is set, enabling and disabling function tracing of a function
simply patches the first instruction: either it is a nop (right now an
unconditional branch, which skips the mcount block), or it is a branch to
the ftrace_caller() function.
If a kprobe is being placed on a function trace caller instruction, we
encode in the remaining bytes after the breakpoint instruction (an illegal
opcode) whether we actually have a nop or a branch.
This is possible since the instruction used for the nop and the branch is
six bytes long, while the breakpoint is only two bytes long.
Therefore the first two bytes contain the illegal opcode and the last four
bytes contain either "0" for nop or "1" for branch. The kprobes code will
then execute/simulate the correct instruction.
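For illustration, the three states of the six byte patch site can be
described with the struct ftrace_insn that this patch introduces (a sketch;
the opcode values are the s390 encodings used further below):

	struct ftrace_insn {		/* six bytes total */
		u16 opc;
		s32 disp;		/* displacement in halfwords */
	} __packed;

	/* ftrace nop:   opc = 0xc0f4 (jg),    disp = 12 -> jg .+24          */
	/* ftrace call:  opc = 0xc005 (brasl), disp = (ftrace_caller - ip)/2 */
	/* kprobe armed: opc = BREAKPOINT_INSTRUCTION,                       */
	/*               disp = KPROBE_ON_FTRACE_NOP (0) or                  */
	/*                      KPROBE_ON_FTRACE_CALL (1)                    */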
Instruction patching for kprobes and the function tracer is always done
with stop_machine(). Therefore we don't have any races where an instruction
is patched concurrently on a different cpu.
Besides that, the program check handler which executes the function trace
caller instruction also won't be executed concurrently with any
stop_machine() execution.
This allows us to keep the full fault-based kprobes handling, which
generates correct pt_regs contents automatically.
Signed-off-by: Heiko Carstens <[email protected]>
---
arch/s390/Kconfig | 1 +
arch/s390/include/asm/ftrace.h | 52 ++++++++++++++--
arch/s390/include/asm/kprobes.h | 1 +
arch/s390/include/asm/lowcore.h | 4 +-
arch/s390/include/asm/pgtable.h | 12 ++++
arch/s390/kernel/asm-offsets.c | 1 -
arch/s390/kernel/early.c | 4 --
arch/s390/kernel/ftrace.c | 132 +++++++++++++++++++++++++---------------
arch/s390/kernel/kprobes.c | 87 ++++++++++++++++++--------
arch/s390/kernel/mcount.S | 1 +
arch/s390/kernel/setup.c | 2 -
arch/s390/kernel/smp.c | 1 -
scripts/recordmcount.c | 2 +-
scripts/recordmcount.pl | 2 +-
14 files changed, 210 insertions(+), 92 deletions(-)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index f2cf1f90295b..a6d1c9f8ea2c 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -63,6 +63,7 @@ config ARCH_SUPPORTS_UPROBES
config S390
def_bool y
+ select ARCH_HANDLES_KPROBES_ON_FTRACE
select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
select ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS
select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index 3aef8afec336..785041f1dc77 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -1,25 +1,67 @@
#ifndef _ASM_S390_FTRACE_H
#define _ASM_S390_FTRACE_H
+#define ARCH_SUPPORTS_FTRACE_OPS 1
+
+#define MCOUNT_INSN_SIZE 24
+#define MCOUNT_RETURN_FIXUP 18
+
#ifndef __ASSEMBLY__
-extern void _mcount(void);
+void _mcount(void);
+void ftrace_caller(void);
+
extern char ftrace_graph_caller_end;
+extern unsigned long ftrace_plt;
struct dyn_arch_ftrace { };
-#define MCOUNT_ADDR ((long)_mcount)
+#define MCOUNT_ADDR ((unsigned long)_mcount)
+#define FTRACE_ADDR ((unsigned long)ftrace_caller)
+#define KPROBE_ON_FTRACE_NOP 0
+#define KPROBE_ON_FTRACE_CALL 1
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
return addr;
}
-#endif /* __ASSEMBLY__ */
+struct ftrace_insn {
+ u16 opc;
+ s32 disp;
+} __packed;
-#define MCOUNT_INSN_SIZE 18
+static inline void ftrace_generate_nop_insn(struct ftrace_insn *insn)
+{
+#ifdef CONFIG_FUNCTION_TRACER
+ /* jg .+24 */
+ insn->opc = 0xc0f4;
+ insn->disp = MCOUNT_INSN_SIZE / 2;
+#endif
+}
-#define ARCH_SUPPORTS_FTRACE_OPS 1
+static inline int is_ftrace_nop(struct ftrace_insn *insn)
+{
+#ifdef CONFIG_FUNCTION_TRACER
+ if (insn->disp == MCOUNT_INSN_SIZE / 2)
+ return 1;
+#endif
+ return 0;
+}
+
+static inline void ftrace_generate_call_insn(struct ftrace_insn *insn,
+ unsigned long ip)
+{
+#ifdef CONFIG_FUNCTION_TRACER
+ unsigned long target;
+ /* brasl r0,ftrace_caller */
+ target = is_module_addr((void *) ip) ? ftrace_plt : FTRACE_ADDR;
+ insn->opc = 0xc005;
+ insn->disp = (target - ip) / 2;
+#endif
+}
+
+#endif /* __ASSEMBLY__ */
#endif /* _ASM_S390_FTRACE_H */
diff --git a/arch/s390/include/asm/kprobes.h b/arch/s390/include/asm/kprobes.h
index 98629173ce3b..b47ad3b642cc 100644
--- a/arch/s390/include/asm/kprobes.h
+++ b/arch/s390/include/asm/kprobes.h
@@ -60,6 +60,7 @@ typedef u16 kprobe_opcode_t;
struct arch_specific_insn {
/* copy of original instruction */
kprobe_opcode_t *insn;
+ unsigned int is_ftrace_insn : 1;
};
struct prev_kprobe {
diff --git a/arch/s390/include/asm/lowcore.h b/arch/s390/include/asm/lowcore.h
index 6cc51fe84410..34fbcac61133 100644
--- a/arch/s390/include/asm/lowcore.h
+++ b/arch/s390/include/asm/lowcore.h
@@ -147,7 +147,7 @@ struct _lowcore {
__u32 softirq_pending; /* 0x02ec */
__u32 percpu_offset; /* 0x02f0 */
__u32 machine_flags; /* 0x02f4 */
- __u32 ftrace_func; /* 0x02f8 */
+ __u8 pad_0x02f8[0x02fc-0x02f8]; /* 0x02f8 */
__u32 spinlock_lockval; /* 0x02fc */
__u8 pad_0x0300[0x0e00-0x0300]; /* 0x0300 */
@@ -297,7 +297,7 @@ struct _lowcore {
__u64 percpu_offset; /* 0x0378 */
__u64 vdso_per_cpu_data; /* 0x0380 */
__u64 machine_flags; /* 0x0388 */
- __u64 ftrace_func; /* 0x0390 */
+ __u8 pad_0x0390[0x0398-0x0390]; /* 0x0390 */
__u64 gmap; /* 0x0398 */
__u32 spinlock_lockval; /* 0x03a0 */
__u8 pad_0x03a0[0x0400-0x03a4]; /* 0x03a4 */
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 57c882761dea..e7c7f85a9160 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -133,6 +133,18 @@ extern unsigned long MODULES_END;
#define MODULES_LEN (1UL << 31)
#endif
+static inline int is_module_addr(void *addr)
+{
+#ifdef CONFIG_64BIT
+ BUILD_BUG_ON(MODULES_LEN > (1UL << 31));
+ if (addr < (void *)MODULES_VADDR)
+ return 0;
+ if (addr > (void *)MODULES_END)
+ return 0;
+#endif
+ return 1;
+}
+
/*
* A 31 bit pagetable entry of S390 has following format:
* | PFRA | | OS |
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index ef279a136801..f3a78337ca86 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -156,7 +156,6 @@ int main(void)
DEFINE(__LC_INT_CLOCK, offsetof(struct _lowcore, int_clock));
DEFINE(__LC_MCCK_CLOCK, offsetof(struct _lowcore, mcck_clock));
DEFINE(__LC_MACHINE_FLAGS, offsetof(struct _lowcore, machine_flags));
- DEFINE(__LC_FTRACE_FUNC, offsetof(struct _lowcore, ftrace_func));
DEFINE(__LC_DUMP_REIPL, offsetof(struct _lowcore, ipib));
BLANK();
DEFINE(__LC_CPU_TIMER_SAVE_AREA, offsetof(struct _lowcore, cpu_timer_save_area));
diff --git a/arch/s390/kernel/early.c b/arch/s390/kernel/early.c
index cef2879edff3..302ac1f7f8e7 100644
--- a/arch/s390/kernel/early.c
+++ b/arch/s390/kernel/early.c
@@ -12,7 +12,6 @@
#include <linux/errno.h>
#include <linux/string.h>
#include <linux/ctype.h>
-#include <linux/ftrace.h>
#include <linux/lockdep.h>
#include <linux/module.h>
#include <linux/pfn.h>
@@ -490,8 +489,5 @@ void __init startup_init(void)
detect_machine_facilities();
setup_topology();
sclp_early_detect();
-#ifdef CONFIG_DYNAMIC_FTRACE
- S390_lowcore.ftrace_func = (unsigned long)ftrace_caller;
-#endif
lockdep_on();
}
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index 51d14fe5eb9a..5744d25c1d33 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -7,6 +7,7 @@
* Martin Schwidefsky <[email protected]>
*/
+#include <linux/moduleloader.h>
#include <linux/hardirq.h>
#include <linux/uaccess.h>
#include <linux/ftrace.h>
@@ -15,60 +16,39 @@
#include <linux/kprobes.h>
#include <trace/syscall.h>
#include <asm/asm-offsets.h>
+#include <asm/cacheflush.h>
#include "entry.h"
-void mcount_replace_code(void);
-void ftrace_disable_code(void);
-void ftrace_enable_insn(void);
-
/*
* The mcount code looks like this:
* stg %r14,8(%r15) # offset 0
* larl %r1,<&counter> # offset 6
* brasl %r14,_mcount # offset 12
* lg %r14,8(%r15) # offset 18
- * Total length is 24 bytes. The complete mcount block initially gets replaced
- * by ftrace_make_nop. Subsequent calls to ftrace_make_call / ftrace_make_nop
- * only patch the jg/lg instruction within the block.
- * Note: we do not patch the first instruction to an unconditional branch,
- * since that would break kprobes/jprobes. It is easier to leave the larl
- * instruction in and only modify the second instruction.
+ * Total length is 24 bytes. Only the first instruction will be patched
+ * by ftrace_make_call / ftrace_make_nop.
* The enabled ftrace code block looks like this:
- * larl %r0,.+24 # offset 0
- * > lg %r1,__LC_FTRACE_FUNC # offset 6
- * br %r1 # offset 12
- * brcl 0,0 # offset 14
- * brc 0,0 # offset 20
+ * > brasl %r0,ftrace_caller # offset 0
+ * larl %r1,<&counter> # offset 6
+ * brasl %r14,_mcount # offset 12
+ * lg %r14,8(%r15) # offset 18
* The ftrace function gets called with a non-standard C function call ABI
* where r0 contains the return address. It is also expected that the called
* function only clobbers r0 and r1, but restores r2-r15.
+ * For module code we can't directly jump to ftrace caller, but need a
+ * trampoline (ftrace_plt), which clobbers also r1.
* The return point of the ftrace function has offset 24, so execution
* continues behind the mcount block.
- * larl %r0,.+24 # offset 0
- * > jg .+18 # offset 6
- * br %r1 # offset 12
- * brcl 0,0 # offset 14
- * brc 0,0 # offset 20
+ * The disabled ftrace code block looks like this:
+ * > jg .+24 # offset 0
+ * larl %r1,<&counter> # offset 6
+ * brasl %r14,_mcount # offset 12
+ * lg %r14,8(%r15) # offset 18
* The jg instruction branches to offset 24 to skip as many instructions
* as possible.
*/
-asm(
- " .align 4\n"
- "mcount_replace_code:\n"
- " larl %r0,0f\n"
- "ftrace_disable_code:\n"
- " jg 0f\n"
- " br %r1\n"
- " brcl 0,0\n"
- " brc 0,0\n"
- "0:\n"
- " .align 4\n"
- "ftrace_enable_insn:\n"
- " lg %r1,"__stringify(__LC_FTRACE_FUNC)"\n");
-
-#define MCOUNT_BLOCK_SIZE 24
-#define MCOUNT_INSN_OFFSET 6
-#define FTRACE_INSN_SIZE 6
+
+unsigned long ftrace_plt;
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
unsigned long addr)
@@ -79,24 +59,62 @@ int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
unsigned long addr)
{
- /* Initial replacement of the whole mcount block */
- if (addr == MCOUNT_ADDR) {
- if (probe_kernel_write((void *) rec->ip - MCOUNT_INSN_OFFSET,
- mcount_replace_code,
- MCOUNT_BLOCK_SIZE))
- return -EPERM;
- return 0;
+ struct ftrace_insn insn;
+ unsigned short op;
+ void *from, *to;
+ size_t size;
+
+ ftrace_generate_nop_insn(&insn);
+ size = sizeof(insn);
+ from = &insn;
+ to = (void *) rec->ip;
+ if (probe_kernel_read(&op, (void *) rec->ip, sizeof(op)))
+ return -EFAULT;
+ /*
+ * If we find a breakpoint instruction, a kprobe has been placed
+ * at the beginning of the function. We write the constant
+ * KPROBE_ON_FTRACE_NOP into the remaining four bytes of the original
+ * instruction so that the kprobes handler can execute a nop, if it
+ * reaches this breakpoint.
+ */
+ if (op == BREAKPOINT_INSTRUCTION) {
+ size -= 2;
+ from += 2;
+ to += 2;
+ insn.disp = KPROBE_ON_FTRACE_NOP;
}
- if (probe_kernel_write((void *) rec->ip, ftrace_disable_code,
- MCOUNT_INSN_SIZE))
+ if (probe_kernel_write(to, from, size))
return -EPERM;
return 0;
}
int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
{
- if (probe_kernel_write((void *) rec->ip, ftrace_enable_insn,
- FTRACE_INSN_SIZE))
+ struct ftrace_insn insn;
+ unsigned short op;
+ void *from, *to;
+ size_t size;
+
+ ftrace_generate_call_insn(&insn, rec->ip);
+ size = sizeof(insn);
+ from = &insn;
+ to = (void *) rec->ip;
+ if (probe_kernel_read(&op, (void *) rec->ip, sizeof(op)))
+ return -EFAULT;
+ /*
+ * If we find a breakpoint instruction, a kprobe has been placed
+ * at the beginning of the function. We write the constant
+ * KPROBE_ON_FTRACE_CALL into the remaining four bytes of the original
+ * instruction so that the kprobes handler can execute a brasl if it
+ * reaches this breakpoint.
+ */
+ if (op == BREAKPOINT_INSTRUCTION) {
+ size -= 2;
+ from += 2;
+ to += 2;
+ insn.disp = KPROBE_ON_FTRACE_CALL;
+ }
+ if (probe_kernel_write(to, from, size))
return -EPERM;
return 0;
}
@@ -111,6 +129,24 @@ int __init ftrace_dyn_arch_init(void)
return 0;
}
+static int __init ftrace_plt_init(void)
+{
+ unsigned int *ip;
+
+ ftrace_plt = (unsigned long) module_alloc(PAGE_SIZE);
+ if (!ftrace_plt)
+ panic("cannot allocate ftrace plt\n");
+ ip = (unsigned int *) ftrace_plt;
+ ip[0] = 0x0d10e310; /* basr 1,0; lg 1,10(1); br 1 */
+ ip[1] = 0x100a0004;
+ ip[2] = 0x07f10000;
+ ip[3] = FTRACE_ADDR >> 32;
+ ip[4] = FTRACE_ADDR & 0xffffffff;
+ set_memory_ro(ftrace_plt, 1);
+ return 0;
+}
+device_initcall(ftrace_plt_init);
+
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
/*
* Hook the return address and push it in the stack of return addresses
diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
index 014d4729b134..50e3a0aff682 100644
--- a/arch/s390/kernel/kprobes.c
+++ b/arch/s390/kernel/kprobes.c
@@ -29,6 +29,7 @@
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/hardirq.h>
+#include <linux/ftrace.h>
#include <asm/cacheflush.h>
#include <asm/sections.h>
#include <asm/dis.h>
@@ -60,10 +61,21 @@ struct kprobe_insn_cache kprobe_dmainsn_slots = {
static void __kprobes copy_instruction(struct kprobe *p)
{
+ unsigned long ip = (unsigned long) p->addr;
s64 disp, new_disp;
u64 addr, new_addr;
- memcpy(p->ainsn.insn, p->addr, insn_length(p->opcode >> 8));
+ if (ftrace_location(ip) == ip) {
+ /*
+ * If kprobes patches the instruction that is morphed by
+ * ftrace make sure that kprobes always sees the branch
+ * "jg .+24" that skips the mcount block
+ */
+ ftrace_generate_nop_insn((struct ftrace_insn *)p->ainsn.insn);
+ p->ainsn.is_ftrace_insn = 1;
+ } else
+ memcpy(p->ainsn.insn, p->addr, insn_length(p->opcode >> 8));
+ p->opcode = p->ainsn.insn[0];
if (!probe_is_insn_relative_long(p->ainsn.insn))
return;
/*
@@ -85,18 +97,6 @@ static inline int is_kernel_addr(void *addr)
return addr < (void *)_end;
}
-static inline int is_module_addr(void *addr)
-{
-#ifdef CONFIG_64BIT
- BUILD_BUG_ON(MODULES_LEN > (1UL << 31));
- if (addr < (void *)MODULES_VADDR)
- return 0;
- if (addr > (void *)MODULES_END)
- return 0;
-#endif
- return 1;
-}
-
static int __kprobes s390_get_insn_slot(struct kprobe *p)
{
/*
@@ -132,43 +132,58 @@ int __kprobes arch_prepare_kprobe(struct kprobe *p)
return -EINVAL;
if (s390_get_insn_slot(p))
return -ENOMEM;
- p->opcode = *p->addr;
copy_instruction(p);
return 0;
}
-struct ins_replace_args {
- kprobe_opcode_t *ptr;
- kprobe_opcode_t opcode;
+struct swap_insn_args {
+ struct kprobe *p;
+ unsigned int arm_kprobe : 1;
};
-static int __kprobes swap_instruction(void *aref)
+static int __kprobes swap_instruction(void *data)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
unsigned long status = kcb->kprobe_status;
- struct ins_replace_args *args = aref;
-
+ struct swap_insn_args *args = data;
+ struct ftrace_insn new_insn, *insn;
+ struct kprobe *p = args->p;
+ size_t len;
+
+ new_insn.opc = args->arm_kprobe ? BREAKPOINT_INSTRUCTION : p->opcode;
+ len = sizeof(new_insn.opc);
+ if (!p->ainsn.is_ftrace_insn)
+ goto skip_ftrace;
+ len = sizeof(new_insn);
+ insn = (struct ftrace_insn *) p->addr;
+ if (args->arm_kprobe) {
+ if (is_ftrace_nop(insn))
+ new_insn.disp = KPROBE_ON_FTRACE_NOP;
+ else
+ new_insn.disp = KPROBE_ON_FTRACE_CALL;
+ } else {
+ ftrace_generate_call_insn(&new_insn, (unsigned long)p->addr);
+ if (insn->disp == KPROBE_ON_FTRACE_NOP)
+ ftrace_generate_nop_insn(&new_insn);
+ }
+skip_ftrace:
kcb->kprobe_status = KPROBE_SWAP_INST;
- probe_kernel_write(args->ptr, &args->opcode, sizeof(args->opcode));
+ probe_kernel_write(p->addr, &new_insn, len);
kcb->kprobe_status = status;
return 0;
}
void __kprobes arch_arm_kprobe(struct kprobe *p)
{
- struct ins_replace_args args;
+ struct swap_insn_args args = {.p = p, .arm_kprobe = 1};
- args.ptr = p->addr;
- args.opcode = BREAKPOINT_INSTRUCTION;
stop_machine(swap_instruction, &args, NULL);
}
void __kprobes arch_disarm_kprobe(struct kprobe *p)
{
- struct ins_replace_args args;
+ struct swap_insn_args args = {.p = p, .arm_kprobe = 0};
- args.ptr = p->addr;
- args.opcode = p->opcode;
stop_machine(swap_instruction, &args, NULL);
}
@@ -459,6 +474,24 @@ static void __kprobes resume_execution(struct kprobe *p, struct pt_regs *regs)
unsigned long ip = regs->psw.addr & PSW_ADDR_INSN;
int fixup = probe_get_fixup_type(p->ainsn.insn);
+ /* Check if the kprobes location is an enabled ftrace caller */
+ if (p->ainsn.is_ftrace_insn) {
+ struct ftrace_insn *insn = (struct ftrace_insn *) p->addr;
+ struct ftrace_insn call_insn;
+
+ ftrace_generate_call_insn(&call_insn, (unsigned long) p->addr);
+ /*
+ * A kprobe on an enabled ftrace call site actually single
+ * stepped an unconditional branch (ftrace nop equivalent).
+ * Now we need to fixup things and pretend that a brasl r0,...
+ * was executed instead.
+ */
+ if (insn->disp == KPROBE_ON_FTRACE_CALL) {
+ ip += call_insn.disp * 2 - MCOUNT_INSN_SIZE;
+ regs->gprs[0] = (unsigned long)p->addr + sizeof(*insn);
+ }
+ }
+
if (fixup & FIXUP_PSW_NORMAL)
ip += (unsigned long) p->addr - (unsigned long) p->ainsn.insn;
diff --git a/arch/s390/kernel/mcount.S b/arch/s390/kernel/mcount.S
index 4300ea374826..b6dfc5bfcb89 100644
--- a/arch/s390/kernel/mcount.S
+++ b/arch/s390/kernel/mcount.S
@@ -27,6 +27,7 @@ ENTRY(ftrace_caller)
.globl ftrace_regs_caller
.set ftrace_regs_caller,ftrace_caller
lgr %r1,%r15
+ aghi %r0,MCOUNT_RETURN_FIXUP
aghi %r15,-STACK_FRAME_SIZE
stg %r1,__SF_BACKCHAIN(%r15)
stg %r1,(STACK_PTREGS_GPRS+15*8)(%r15)
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c
index e80d9ff9a56d..4e532c67832f 100644
--- a/arch/s390/kernel/setup.c
+++ b/arch/s390/kernel/setup.c
@@ -41,7 +41,6 @@
#include <linux/ctype.h>
#include <linux/reboot.h>
#include <linux/topology.h>
-#include <linux/ftrace.h>
#include <linux/kexec.h>
#include <linux/crash_dump.h>
#include <linux/memory.h>
@@ -356,7 +355,6 @@ static void __init setup_lowcore(void)
lc->steal_timer = S390_lowcore.steal_timer;
lc->last_update_timer = S390_lowcore.last_update_timer;
lc->last_update_clock = S390_lowcore.last_update_clock;
- lc->ftrace_func = S390_lowcore.ftrace_func;
restart_stack = __alloc_bootmem(ASYNC_SIZE, ASYNC_SIZE, 0);
restart_stack += ASYNC_SIZE;
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 6fd9e60101f1..0b499f5cbe19 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -236,7 +236,6 @@ static void pcpu_prepare_secondary(struct pcpu *pcpu, int cpu)
lc->percpu_offset = __per_cpu_offset[cpu];
lc->kernel_asce = S390_lowcore.kernel_asce;
lc->machine_flags = S390_lowcore.machine_flags;
- lc->ftrace_func = S390_lowcore.ftrace_func;
lc->user_timer = lc->system_timer = lc->steal_timer = 0;
__ctl_store(lc->cregs_save_area, 0, 15);
save_access_regs((unsigned int *) lc->access_regs_save_area);
diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index 001facfa5b74..3d1984e59a30 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -404,7 +404,7 @@ do_file(char const *const fname)
}
if (w2(ghdr->e_machine) == EM_S390) {
reltype = R_390_64;
- mcount_adjust_64 = -8;
+ mcount_adjust_64 = -14;
}
if (w2(ghdr->e_machine) == EM_MIPS) {
reltype = R_MIPS_64;
diff --git a/scripts/recordmcount.pl b/scripts/recordmcount.pl
index d4b665610d67..56ea99a12ab7 100755
--- a/scripts/recordmcount.pl
+++ b/scripts/recordmcount.pl
@@ -243,7 +243,7 @@ if ($arch eq "x86_64") {
} elsif ($arch eq "s390" && $bits == 64) {
$mcount_regex = "^\\s*([0-9a-fA-F]+):\\s*R_390_(PC|PLT)32DBL\\s+_mcount\\+0x2\$";
- $mcount_adjust = -8;
+ $mcount_adjust = -14;
$alignment = 8;
$type = ".quad";
$ld .= " -m elf64_s390";
--
1.8.5.5
Hi Heiko,
(2014/10/16 0:46), Heiko Carstens wrote:
> Hi all,
>
> we would like to implement an architecture specific variant of "kprobes
> on ftrace" without using the current HAVE_KPROBES_ON_FTRACE infrastructure
> which is currently only used by x86.
>
> The rationale for these two patches is:
> - we want to patch the first instruction of the mcount code block to
> reduce the overhead of the function tracer
> - we'd like to keep the ftrace_caller function as simple as possible and
> not require it to generate a 100% valid pt_regs structure as required
> by the combination of DYNAMIC_FTRACE_WITH_REGS and HAVE_KPROBES_ON_FTRACE.
> This allows us to not generate the psw mask field in the pt_regs
> structure on each function tracer enabled function, which otherwise would
> be very expensive. Besides that, program-check-generated pt_regs contents
> are "more" accurate than program-generated ones and don't require any
> maintenance.
> And also we can keep the ftrace and kprobes backends quite separated.
I'm not sure about s390, nor do I have the machine, so it would be very
helpful if you could give us a command line level test and show us the
result with this patch :)
Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
You can add a testcase for checking the co-existence of kprobes and ftrace on
the entry of a function.
And also, since ftrace now supports assembly trampoline code for each
handler, the performance overhead can be reduced if we save registers only
when kprobes are enabled on the function. I'm not sure whether it can be
implemented on s390, but your requirement looks similar. What do you think
about that?
Thank you,
>
> In order to make this work, a small common code change is necessary which
> removes a check whether a kprobe is being placed on an ftrace location (see
> the first patch).
>
> If possible, I'd like to have an ACK from at least one of the kprobes
> maintainers for the first patch and bring it upstream via the s390 tree.
>
> Thanks,
> Heiko
>
> Heiko Carstens (2):
> kprobes: introduce ARCH_HANDLES_KPROBES_ON_FTRACE
> s390/ftrace,kprobes: allow to patch first instruction
>
> arch/Kconfig | 8 +++
> arch/s390/Kconfig | 1 +
> arch/s390/include/asm/ftrace.h | 52 ++++++++++++++--
> arch/s390/include/asm/kprobes.h | 1 +
> arch/s390/include/asm/lowcore.h | 4 +-
> arch/s390/include/asm/pgtable.h | 12 ++++
> arch/s390/kernel/asm-offsets.c | 1 -
> arch/s390/kernel/early.c | 4 --
> arch/s390/kernel/ftrace.c | 132 +++++++++++++++++++++++++---------------
> arch/s390/kernel/kprobes.c | 87 ++++++++++++++++++--------
> arch/s390/kernel/mcount.S | 1 +
> arch/s390/kernel/setup.c | 2 -
> arch/s390/kernel/smp.c | 1 -
> kernel/kprobes.c | 3 +-
> scripts/recordmcount.c | 2 +-
> scripts/recordmcount.pl | 2 +-
> 16 files changed, 220 insertions(+), 93 deletions(-)
>
--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]
Hi Masami,
thank you for your comments!
On Thu, Oct 16, 2014 at 02:49:56PM +0900, Masami Hiramatsu wrote:
> (2014/10/16 0:46), Heiko Carstens wrote:
> > we would like to implement an architecture specific variant of "kprobes
> > on ftrace" without using the current HAVE_KPROBES_ON_FTRACE infrastructure
> > which is currently only used by x86.
> >
> > The rationale for these two patches is:
> > - we want to patch the first instruction of the mcount code block to
> > reduce the overhead of the function tracer
> > - we'd like to keep the ftrace_caller function as simple as possible and
> > not require it to generate a 100% valid pt_regs structure as required
> > by the combination of DYNAMIC_FTRACE_WITH_REGS and HAVE_KPROBES_ON_FTRACE.
> > This allows us to not generate the psw mask field in the pt_regs
> > structure on each function tracer enabled function, which otherwise would
> > be very expensive. Besides that, program-check-generated pt_regs contents
> > are "more" accurate than program-generated ones and don't require any
> > maintenance.
> > And also we can keep the ftrace and kprobes backends quite separated.
>
> I'm not sure about s390, nor do I have the machine, so it would be very
> helpful if you could give us a command line level test and show us the
> result with this patch :)
> Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
> You can add a testcase for checking the co-existence of kprobes and ftrace on
> the entry of a function.
Ok, my personal test case was just a kernel module that added/removed a
kprobe while also enabling/disabling function tracing, and verified that both
the kprobes handler got called correctly and correct entries were added to
the function trace buffer.
Everything worked as expected; however, I think I can come up with a test
case that will fit into the ftrace selftests.
I just need to get familiar with the kprobe dynamic event interface (again) ;)
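(For illustration, such a module test can be as small as the following
sketch. This is not the actual module used, just a minimal kprobe module,
assuming do_fork as the traced/probed function and with error handling
trimmed:)

	#include <linux/module.h>
	#include <linux/kprobes.h>

	static int pre_handler(struct kprobe *p, struct pt_regs *regs)
	{
		pr_info("kprobe hit at %p\n", p->addr);
		return 0;
	}

	static struct kprobe kp = {
		.symbol_name	= "do_fork",	/* function also enabled for ftrace */
		.pre_handler	= pre_handler,
	};

	static int __init kprobe_ftrace_test_init(void)
	{
		return register_kprobe(&kp);
	}

	static void __exit kprobe_ftrace_test_exit(void)
	{
		unregister_kprobe(&kp);
	}

	module_init(kprobe_ftrace_test_init);
	module_exit(kprobe_ftrace_test_exit);
	MODULE_LICENSE("GPL");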
> And also, since ftrace now supports assembly trampoline code for each
> handler, the performance overhead can be reduced if we save registers only
> when kprobes are enabled on the function. I'm not sure whether it can be
> implemented on s390, but your requirement looks similar. What do you think
> about that?
Yes, Vojtech Pavlik did send patches which implemented that. But that resulted
in a lot of asm code duplication which I didn't feel comfortable with.
Right now the s390 variant of ftrace_regs_caller() is an alias to
ftrace_caller() which means we only have one piece of code which handles both
variants.
I'm still thinking of a different solution which handles the creation of the
pt_regs contents 100% correctly with its own trampoline for ftrace_regs_caller(),
but that's a different story.
However, this is more or less independent of the question of what you think
about allowing an architecture to handle kprobes on ftrace completely in the
architecture backend.
From a purely technical point of view both versions can be implemented and would
also work on s390:
- either handling kprobes on ftrace completely by the architecture
- or implement a "full" version of DYNAMIC_FTRACE_WITH_REGS and then also
implement a similar HAVE_KPROBES_ON_FTRACE backend like x86 already has
Personally I'd prefer handling kprobes on ftrace completely by the
architecture backend, because both Martin and I think this simply looks
better, and keeps things more separated.
Given that this preference only requires removal of a sanity check in
kprobes.c ("don't put a kprobe on an ftrace location") I'd say this shouldn't
be a problem, no?
Thanks,
Heiko
On Thu, Oct 16, 2014 at 02:49:56PM +0900, Masami Hiramatsu wrote:
> Hi Heiko,
>
> (2014/10/16 0:46), Heiko Carstens wrote:
> > Hi all,
> >
> > we would like to implement an architecture specific variant of "kprobes
> > on ftrace" without using the current HAVE_KPROBES_ON_FTRACE infrastructure
> > which is currently only used by x86.
[...]
> I'm not sure about s390, nor do I have the machine, so it would be very
> helpful if you could give us a command line level test and show us the
> result with this patch :)
> Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
> You can add a testcase for checking the co-existence of kprobes and ftrace on
> the entry of a function.
So how about something like below?
From e7ba7796bfc7179aad1c612571d330cae4c048b7 Mon Sep 17 00:00:00 2001
From: Heiko Carstens <[email protected]>
Date: Fri, 17 Oct 2014 10:01:03 +0200
Subject: [PATCH] ftracetest: add kprobes on ftrace testcase
Add a kprobes on ftrace testcase. The testcase verifies that
- enabling and disabling function tracing works on a function which
already contains a dynamic kprobe
- adding and removing a dynamic kprobe works on a function which is
already enabled for function tracing
Signed-off-by: Heiko Carstens <[email protected]>
---
.../ftrace/test.d/kprobe/kprobe_ftrace.tc | 56 ++++++++++++++++++++++
1 file changed, 56 insertions(+)
create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc
diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc
new file mode 100644
index 000000000000..4ddd18054fda
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc
@@ -0,0 +1,56 @@
+#!/bin/sh
+# description: Kprobe dynamic event with function tracer
+
+[ -f kprobe_events ] || exit_unsupported # this is configurable
+grep function available_tracers || exit_unsupported # this is configurable
+
+# prepare
+echo nop > current_tracer
+echo do_fork > set_ftrace_filter
+echo 0 > events/enable
+echo > kprobe_events
+echo 'p:testprobe do_fork' > kprobe_events
+
+# kprobe on / ftrace off
+echo 1 > events/kprobes/testprobe/enable
+echo > trace
+( echo "forked")
+grep testprobe trace
+! grep 'do_fork <-' trace
+
+# kprobe on / ftrace on
+echo function > current_tracer
+echo > trace
+( echo "forked")
+grep testprobe trace
+grep 'do_fork <-' trace
+
+# kprobe off / ftrace on
+echo 0 > events/kprobes/testprobe/enable
+echo > trace
+( echo "forked")
+! grep testprobe trace
+grep 'do_fork <-' trace
+
+# kprobe on / ftrace on
+echo 1 > events/kprobes/testprobe/enable
+echo function > current_tracer
+echo > trace
+( echo "forked")
+echo > trace
+grep testprobe trace
+grep 'do_fork <-' trace
+
+# kprobe on / ftrace off
+echo nop > current_tracer
+echo > trace
+( echo "forked")
+grep testprobe trace
+! grep 'do_fork <-' trace
+
+# cleanup
+echo nop > current_tracer
+echo > set_ftrace_filter
+echo 0 > events/kprobes/testprobe/enable
+echo > kprobe_events
+echo > trace
--
1.8.5.5
On Thu, Oct 16, 2014 at 02:49:56PM +0900, Masami Hiramatsu wrote:
> I'm not sure about s390, nor do I have the machine, so it would be very
> helpful if you could give us a command line level test and show us the
> result with this patch :)
> Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
> You can add a testcase for checking the co-existence of kprobes and ftrace on
> the entry of a function.
FWIW, I was also surprised to see that the ftrace testcases are not executed
in sorted order. Maybe you might consider the patch below as well:
From a659098f63f188e603316e4c4ccb435bf360987a Mon Sep 17 00:00:00 2001
From: Heiko Carstens <[email protected]>
Date: Fri, 17 Oct 2014 10:08:35 +0200
Subject: [PATCH] ftracetest: sort testcases
Make sure the order of the executed testcases is always the same.
Signed-off-by: Heiko Carstens <[email protected]>
---
tools/testing/selftests/ftrace/ftracetest | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
index a8f81c782856..2007a2cde56f 100755
--- a/tools/testing/selftests/ftrace/ftracetest
+++ b/tools/testing/selftests/ftrace/ftracetest
@@ -37,7 +37,7 @@ abspath() {
}
find_testcases() { #directory
- echo `find $1 -name \*.tc`
+ echo `find $1 -name \*.tc | sort`
}
parse_opts() { # opts
--
1.8.5.5
On Fri, Oct 17, 2014 at 10:19:21AM +0200, Heiko Carstens wrote:
> +# kprobe on / ftrace on
> +echo 1 > events/kprobes/testprobe/enable
> +echo function > current_tracer
> +echo > trace
> +( echo "forked")
> +echo > trace
^^^ --- this line shouldn't be here, sorry.
If this patch is acceptable, please remove the line, or I can resend a
proper version.
(2014/10/17 17:21), Heiko Carstens wrote:
> On Thu, Oct 16, 2014 at 02:49:56PM +0900, Masami Hiramatsu wrote:
>> I'm not sure about s390, nor do I have the machine, so it would be very
>> helpful if you could give us a command line level test and show us the
>> result with this patch :)
>> Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
>> You can add a testcase for checking the co-existence of kprobes and ftrace on
>> the entry of a function.
>
> FWIW, I was also surprised to see that the ftrace testcases are not executed
> in sorted order. Maybe you might consider the patch below as well:
Ah, right. I had misunderstood and thought it was already sorted.
Could you resend it as a separate patch, with my ack?
Acked-by: Masami Hiramatsu <[email protected]>
Thank you,
>
> From a659098f63f188e603316e4c4ccb435bf360987a Mon Sep 17 00:00:00 2001
> From: Heiko Carstens <[email protected]>
> Date: Fri, 17 Oct 2014 10:08:35 +0200
> Subject: [PATCH] ftracetest: sort testcases
>
> Make sure the order of the executed testcases is always the same.
>
> Signed-off-by: Heiko Carstens <[email protected]>
> ---
> tools/testing/selftests/ftrace/ftracetest | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest
> index a8f81c782856..2007a2cde56f 100755
> --- a/tools/testing/selftests/ftrace/ftracetest
> +++ b/tools/testing/selftests/ftrace/ftracetest
> @@ -37,7 +37,7 @@ abspath() {
> }
>
> find_testcases() { #directory
> - echo `find $1 -name \*.tc`
> + echo `find $1 -name \*.tc | sort`
> }
>
> parse_opts() { # opts
>
--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]
(2014/10/16 0:46), Heiko Carstens wrote:
> Allow architectures to implement handling of kprobes on function
> tracer call sites on their own, without depending on common code.
>
> This patch removes the kprobes check whether a kprobe is being placed
> on a function tracer call site and therefore gives the architecture full
> responsibility for handling this correctly.
>
> Signed-off-by: Heiko Carstens <[email protected]>
> ---
> arch/Kconfig | 8 ++++++++
> kernel/kprobes.c | 3 ++-
> 2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 05d7a8a458d5..e1a8e0edf03f 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -85,6 +85,14 @@ config KPROBES_ON_FTRACE
> passing of pt_regs to function tracing, then kprobes can
> optimize on top of function tracing.
>
> +config ARCH_HANDLES_KPROBES_ON_FTRACE
> + def_bool n
> + help
> + If an architecture can handle kprobes on function tracer call
> + sites on its own, then this option should be selected. This option
> + removes the check which otherwise prevents setting kprobes on
> + function tracer call sites.
> +
> config UPROBES
> def_bool n
> select PERCPU_RWSEM
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 3995f546d0f3..4cc48aa67635 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1428,7 +1428,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
> return -EILSEQ;
> p->flags |= KPROBE_FLAG_FTRACE;
> #else /* !CONFIG_KPROBES_ON_FTRACE */
> - return -EINVAL;
> + if (!IS_ENABLED(CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE))
> + return -EINVAL;
> #endif
> }
Hmm, this looks a bit convoluted. Maybe we'd better introduce a local
check_ftrace_location() function which always returns 0 if
CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE is set (with a comment! :)), as below.
static int check_ftrace_location(struct kprobe *kp)
{
unsigned long ftrace_address;
/* If an architecture handles kprobes on ftrace, we don't check it */
if (IS_ENABLED(CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE))
return 0;
...
}
Thank you,
--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]
(2014/10/17 17:19), Heiko Carstens wrote:
> On Thu, Oct 16, 2014 at 02:49:56PM +0900, Masami Hiramatsu wrote:
>> Hi Heiko,
>>
>> (2014/10/16 0:46), Heiko Carstens wrote:
>>> Hi all,
>>>
>>> we would like to implement an architecture specific variant of "kprobes
>>> on ftrace" without using the current HAVE_KPROBES_ON_FTRACE infrastructure
>>> which is currently only used by x86.
>
> [...]
>
>> I'm not sure about s390, nor do I have the machine, so it would be very
>> helpful if you could give us a command line level test and show us the
>> result with this patch :)
>> Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
>> You can add a testcase for checking the co-existence of kprobes and ftrace on
>> the entry of a function.
>
> So how about something like below?
Yes! :) And could you add the results before and after to patch 2/2,
so that we can see what it changes on s390?
Thank you!
>
> From e7ba7796bfc7179aad1c612571d330cae4c048b7 Mon Sep 17 00:00:00 2001
> From: Heiko Carstens <[email protected]>
> Date: Fri, 17 Oct 2014 10:01:03 +0200
> Subject: [PATCH] ftracetest: add kprobes on ftrace testcase
>
> Add a kprobes on ftrace testcase. The testcase verifies that
> - enabling and disabling function tracing works on a function which
> already contains a dynamic kprobe
> - adding and removing a dynamic kprobe works on a function which is
> already enabled for function tracing
>
> Signed-off-by: Heiko Carstens <[email protected]>
> ---
> .../ftrace/test.d/kprobe/kprobe_ftrace.tc | 56 ++++++++++++++++++++++
> 1 file changed, 56 insertions(+)
> create mode 100644 tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc
>
> diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc
> new file mode 100644
> index 000000000000..4ddd18054fda
> --- /dev/null
> +++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_ftrace.tc
> @@ -0,0 +1,56 @@
> +#!/bin/sh
> +# description: Kprobe dynamic event with function tracer
> +
> +[ -f kprobe_events ] || exit_unsupported # this is configurable
> +grep function available_tracers || exit_unsupported # this is configurable
> +
> +# prepare
> +echo nop > current_tracer
> +echo do_fork > set_ftrace_filter
> +echo 0 > events/enable
> +echo > kprobe_events
> +echo 'p:testprobe do_fork' > kprobe_events
> +
> +# kprobe on / ftrace off
> +echo 1 > events/kprobes/testprobe/enable
> +echo > trace
> +( echo "forked")
> +grep testprobe trace
> +! grep 'do_fork <-' trace
> +
> +# kprobe on / ftrace on
> +echo function > current_tracer
> +echo > trace
> +( echo "forked")
> +grep testprobe trace
> +grep 'do_fork <-' trace
> +
> +# kprobe off / ftrace on
> +echo 0 > events/kprobes/testprobe/enable
> +echo > trace
> +( echo "forked")
> +! grep testprobe trace
> +grep 'do_fork <-' trace
> +
> +# kprobe on / ftrace on
> +echo 1 > events/kprobes/testprobe/enable
> +echo function > current_tracer
> +echo > trace
> +( echo "forked")
> +echo > trace
> +grep testprobe trace
> +grep 'do_fork <-' trace
> +
> +# kprobe on / ftrace off
> +echo nop > current_tracer
> +echo > trace
> +( echo "forked")
> +grep testprobe trace
> +! grep 'do_fork <-' trace
> +
> +# cleanup
> +echo nop > current_tracer
> +echo > set_ftrace_filter
> +echo 0 > events/kprobes/testprobe/enable
> +echo > kprobe_events
> +echo > trace
>
--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]
Hi Masami,
On Mon, Oct 20, 2014 at 11:02:49AM +0900, Masami Hiramatsu wrote:
> (2014/10/17 17:19), Heiko Carstens wrote:
> > On Thu, Oct 16, 2014 at 02:49:56PM +0900, Masami Hiramatsu wrote:
> >> Hi Heiko,
> >>
> >> (2014/10/16 0:46), Heiko Carstens wrote:
> >>> Hi all,
> >>>
> >>> we would like to implement an architecture specific variant of "kprobes
> >>> on ftrace" without using the current HAVE_KPROBES_ON_FTRACE infrastructure
> >>> which is currently only used by x86.
> >
> > [...]
> >
> >> I'm not sure about s390, nor do I have the machine, so it would be very
> >> helpful if you could give us a command line level test and show us the
> >> result with this patch :)
> >> Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
> >> You can add a testcase for checking the co-existence of kprobes and ftrace on
> >> the entry of a function.
> >
> > So how about something like below?
>
> Yes! :) And could you add the results before and after to patch 2/2,
> so that we can see what it changes on s390?
The output of the testcase is identical before and after patch 2/2. Maybe I
didn't explain my intention well enough.
Just to explain how mcount/ftrace works on s390 today: if we pass the "-pg"
flag to the compiler, a 24-byte(!) block will be added in front of every
function. We patch that block to match the ftrace needs. So an ftrace "nop"
looks like this (simplified):
0: load return address "24" into register
6: jump to 24
12: nop
18: nop
24: <function code>
If the function gets enabled for ftrace we will patch the instruction at
offset 6:
0: load return address "24" into register
6: jump to ftrace_caller
12: nop
18: nop
24: <function code>
So in fact kprobes and ftrace do work nicely together, since we only patch
the second instruction, while kprobes will put a breakpoint on the first
instruction.
However, what I want to achieve with patch 2/2 is that the code will look
like this:
ftrace disabled:
0: jump to 24
6: nop
12: nop
18: nop
24: <function code>
ftrace enabled:
0: branch to ftrace_caller and save return address into register
6: nop
12: nop
18: nop
24: <function code>
So, with patch 2/2 the ftrace location of a function now matches the first
instruction of the function, and the check within kprobes.c which prevents
putting a kprobe on an ftrace location triggers.
So kprobes and ftrace work with and without patch 2/2; all I want to achieve
is that the overhead of the mcount block gets reduced to a single instruction.
Ultimately we also want a compiler change which emits only a single
instruction that we can patch, probably similar to "-pg -mfentry" on x86.
On Mon, 20 Oct 2014 10:55:30 +0900
Masami Hiramatsu <[email protected]> wrote:
> Hmm, this looks a bit convoluted. Maybe we'd better introduce a local
> check_ftrace_location() function which always returns 0 if
> CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE is set (with a comment! :)), as below.
>
> static int check_ftrace_location(struct kprobe *kp)
> {
> unsigned long ftrace_address;
>
> /* If an architecture handles kprobes on ftrace, we don't check it */
> if (IS_ENABLED(CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE))
> return 0;
>
> ...
> }
We can also just make that function weak, and let the archs override
the default behavior?
-- Steve
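(For illustration, such a weak default plus architecture override could look
like the sketch below. The name arch_check_ftrace_location() is illustrative,
and the default body just mirrors the existing check in
check_kprobe_address_safe():)

	/* kernel/kprobes.c: weak default, may be overridden by the arch */
	int __weak arch_check_ftrace_location(struct kprobe *p)
	{
		unsigned long ftrace_addr;

		ftrace_addr = ftrace_location((unsigned long) p->addr);
		if (ftrace_addr) {
	#ifdef CONFIG_KPROBES_ON_FTRACE
			/* Given address is not at an ftrace call boundary */
			if ((unsigned long) p->addr != ftrace_addr)
				return -EILSEQ;
			p->flags |= KPROBE_FLAG_FTRACE;
	#else
			return -EINVAL;
	#endif
		}
		return 0;
	}

	/* arch/s390/kernel/kprobes.c: s390 handles kprobes on ftrace itself */
	int arch_check_ftrace_location(struct kprobe *p)
	{
		return 0;
	}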
(2014/10/21 3:53), Steven Rostedt wrote:
> On Mon, 20 Oct 2014 10:55:30 +0900
> Masami Hiramatsu <[email protected]> wrote:
>
>
>> Hmm, this looks a bit convoluted. Maybe we'd better introduce a local
>> check_ftrace_location() function which always returns 0 if
>> CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE is set (with a comment! :)), as below.
>>
>> static int check_ftrace_location(struct kprobe *kp)
>> {
>> unsigned long ftrace_address;
>>
>> /* If an architecture handles kprobes on ftrace, we don't check it */
>> if (IS_ENABLED(CONFIG_ARCH_HANDLES_KPROBES_ON_FTRACE))
>> return 0;
>>
>> ...
>> }
>
> We can also just make that function weak, and let the archs override
> the default behavior?
Ah, that will be simpler, and we won't need a new Kconfig option.
Thank you!
--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]
(2014/10/16 19:57), Heiko Carstens wrote:
> Hi Masami,
>
> thank you for your comments!
>
> On Thu, Oct 16, 2014 at 02:49:56PM +0900, Masami Hiramatsu wrote:
>> (2014/10/16 0:46), Heiko Carstens wrote:
>>> we would like to implement an architecture specific variant of "kprobes
>>> on ftrace" without using the current HAVE_KPROBES_ON_FTRACE infrastructure
>>> which is currently only used by x86.
>>>
>>> The rationale for these two patches is:
>>> - we want to patch the first instruction of the mcount code block to
>>> reduce the overhead of the function tracer
>>> - we'd like to keep the ftrace_caller function as simple as possible and
>>> not require it to generate a 100% valid pt_regs structure as required
>>> by the combination of DYNAMIC_FTRACE_WITH_REGS and HAVE_KPROBES_ON_FTRACE.
>>> This allows us to not generate the psw mask field in the pt_regs
>>> structure on each function tracer enabled function, which otherwise would
>>> be very expensive. Besides that, program-check-generated pt_regs contents
>>> are "more" accurate than program-generated ones and don't require any
>>> maintenance.
>>> And also we can keep the ftrace and kprobes backends quite separated.
>>
>> I'm not sure about s390, nor do I have the machine, so it would be very
>> helpful if you could give us a command line level test and show us the
>> result with this patch :)
>> Fortunately, we already have ftracetest under tools/testing/selftests/ftrace/.
>> You can add a testcase for checking the co-existence of kprobes and ftrace on
>> the entry of a function.
>
> Ok, my personal test case was just a kernel module that added/removed a
> kprobe while also enabling/disabling function tracing, and verified that both
> the kprobes handler got called correctly and correct entries were added to
> the function trace buffer.
> Everything worked as expected; however, I think I can come up with a test
> case that will fit into the ftrace selftests.
> I just need to get familiar with the kprobe dynamic event interface (again) ;)
>
>> And also, since ftrace now supports assembly trampoline code for each
>> handler, the performance overhead can be reduced if we save registers only
>> when kprobes are enabled on the function. I'm not sure whether it can be
>> implemented on s390, but your requirement looks similar. What do you think
>> about that?
>
> Yes, Vojtech Pavlik did send patches which implemented that. But that resulted
> in a lot of asm code duplication which I didn't feel comfortable with.
> Right now the s390 variant of ftrace_regs_caller() is an alias to
> ftrace_caller() which means we only have one piece of code which handles both
> variants.
> I'm still thinking of a different solution which handles the creation of the
> pt_regs contents 100% correctly with its own trampoline for ftrace_regs_caller(),
> but that's a different story.
>
> However, this is more or less independent of the question of what you think
> about allowing an architecture to handle kprobes on ftrace completely in the
> architecture backend.
>
> From a purely technical point of view both versions can be implemented and would
> also work on s390:
> - either handling kprobes on ftrace completely by the architecture
> - or implement a "full" version of DYNAMIC_FTRACE_WITH_REGS and then also
> implement a similar HAVE_KPROBES_ON_FTRACE backend like x86 already has
>
> Personally I'd prefer handling kprobes on ftrace completely by the
> architecture backend, because both Martin and I think this simply looks
> better, and keeps things more separated.
> Given that this preference only requires removal of a sanity check in
> kprobes.c ("don't put a kprobe on an ftrace location") I'd say this shouldn't
> be a problem, no?
From the functional point of view, if we can set kprobes on an
ftrace site, that is OK for me. And I'm sure your series doesn't
change the behavior, according to your test case. So I think it's OK :)
Thank you,
--
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Research Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: [email protected]