2022-08-26 08:21:25

by Qing Zhang

Subject: [PATCH v2 0/9] LoongArch: Add ftrace support

This patch series adds support for basic and dynamic ftrace.

1) -pg
Compiling with `-pg` makes the compiler emit a stub call as if to a child
function, `void _mcount(void *ra)`. The function prologue can therefore be
seen to store RA and open the stack frame before `call _mcount`; prologue
analysis looks for the `open stack` instruction first, and then for `store RA`.
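
For illustration, a `-pg` instrumented function prologue might look like the
following (a sketch only; the exact code is up to the compiler):

	addi.d  sp, sp, -16     # open stack
	st.d    ra, sp, 8       # store RA
	bl      _mcount         # stub call, as if to a child function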

2) -fpatchable-function-entry=2
The compiler inserts 2 NOPs before the regular function prologue.
T-series registers are available and safe to use because of the LoongArch psABI.

At runtime, the first nop is replaced with "move t0, ra" and the second with
"bl ftrace_caller" to enable the ftrace call; the bl is replaced with a nop
again to disable it. The bl clobbers RA, so the original RA value is saved
in t0 first. The details are:

| Compiled | Disabled | Enabled |
+------------+------------------------+------------------------+
| nop | move t0, ra | move t0, ra |
| nop | nop | bl ftrace_caller |
| func_body | func_body | func_body |

The RA value will be recovered by ftrace_regs_entry and restored into RA
before returning to the regular function prologue. When a function is not
being traced, the "move t0, ra" is harmless.
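
As a rough sketch, enabling tracing for one function amounts to the following
(using the larch_insn_* helpers added by this series; error handling omitted):

	static void enable_ftrace_at(unsigned long entry)
	{
		/* slot 0: nop -> "move t0, ra" (done once by ftrace_init_nop) */
		larch_insn_patch_text((void *)entry,
			larch_insn_gen_move(LOONGARCH_GPR_T0, LOONGARCH_GPR_RA));

		/* slot 1: nop -> "bl ftrace_caller" (done by ftrace_make_call) */
		larch_insn_patch_text((void *)(entry + LOONGARCH_INSN_SIZE),
			larch_insn_gen_bl(entry + LOONGARCH_INSN_SIZE,
					  (unsigned long)ftrace_caller));
	}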

The ftrace startup tests have been performed, and the ftrace test cases in
selftests pass on LoongArch.

Changes in v2:
- Remove the patch "LoongArch: ftrace: Add CALLER_ADDRx macros"; there are
better ways to achieve this
Suggested by Steve:
- Add HAVE_DYNAMIC_FTRACE_WITH_ARGS support (6/9)
Suggested by Jinyang:
- Change addu16id to lu12iw and adjust the module_finalize return value (7/9)
- Use the "jr" pseudo-instruction where applicable (1/9)
- Use the "la.pcrel" instead of "la" (3/9)

Qing Zhang (9):
LoongArch/ftrace: Add basic support
LoongArch/ftrace: Add recordmcount support
LoongArch/ftrace: Add dynamic function tracer support
LoongArch/ftrace: Add dynamic function graph tracer support
LoongArch/ftrace: Add DYNAMIC_FTRACE_WITH_REGS support
LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_ARGS support
LoongArch: modules/ftrace: Initialize PLT at load time
LoongArch/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR support
LoongArch: Enable CONFIG_KALLSYMS_ALL and CONFIG_DEBUG_FS

arch/loongarch/Kconfig | 7 +
arch/loongarch/Makefile | 5 +
arch/loongarch/configs/loongson3_defconfig | 2 +
arch/loongarch/include/asm/ftrace.h | 61 +++++
arch/loongarch/include/asm/inst.h | 36 +++
arch/loongarch/include/asm/module.h | 5 +-
arch/loongarch/include/asm/module.lds.h | 1 +
arch/loongarch/include/asm/unwind.h | 1 +
arch/loongarch/kernel/Makefile | 13 +
arch/loongarch/kernel/entry_dyn.S | 154 ++++++++++++
arch/loongarch/kernel/ftrace.c | 74 ++++++
arch/loongarch/kernel/ftrace_dyn.c | 264 +++++++++++++++++++++
arch/loongarch/kernel/inst.c | 127 ++++++++++
arch/loongarch/kernel/mcount.S | 94 ++++++++
arch/loongarch/kernel/module-sections.c | 11 +
arch/loongarch/kernel/module.c | 47 ++++
arch/loongarch/kernel/unwind_guess.c | 4 +-
arch/loongarch/kernel/unwind_prologue.c | 10 +-
scripts/recordmcount.c | 23 ++
19 files changed, 936 insertions(+), 3 deletions(-)
create mode 100644 arch/loongarch/include/asm/ftrace.h
create mode 100644 arch/loongarch/kernel/entry_dyn.S
create mode 100644 arch/loongarch/kernel/ftrace.c
create mode 100644 arch/loongarch/kernel/ftrace_dyn.c
create mode 100644 arch/loongarch/kernel/mcount.S

--
2.20.1


2022-08-26 08:23:53

by Qing Zhang

Subject: [PATCH v2 3/9] LoongArch/ftrace: Add dynamic function tracer support

The compiler has inserted 2 NOPs before the regular function prologue.
T-series registers are available and safe to use because of the LoongArch psABI.

At runtime, the nop is replaced with a bl to enable the ftrace call, and the
bl is replaced with a nop to disable it. The bl clobbers RA, so the original
RA value is saved in t0 first. The details are:

| Compiled | Disabled | Enabled |
+------------+------------------------+------------------------+
| nop | move t0, ra | move t0, ra |
| nop | nop | bl ftrace_caller |
| func_body | func_body | func_body |

The RA value will be recovered by ftrace_regs_entry and restored into RA
before returning to the regular function prologue. When a function is not
being traced, the "move t0, ra" is harmless.

1) ftrace_make_call, ftrace_make_nop (in kernel/ftrace_dyn.c)
The two functions turn each recorded call site of a filtered function
into a call to ftrace_caller or back into a nop.

2) ftrace_update_ftrace_func (in kernel/ftrace_dyn.c)
turns the nop at ftrace_call into a call to a generic entry for
function tracers.

3) ftrace_caller (in kernel/entry_dyn.S)
The entry point that each patched call site branches to once the
function is filtered to be traced (see the schematic after this list).
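
Schematically, the enabled-state flow at a traced function is:

	func:
		move    t0, ra          # patched slot 0: parent RA -> t0
		bl      ftrace_caller   # patched slot 1
		...                     # regular prologue runs after return

	ftrace_caller:
		ftrace_regs_entry       # build a pt_regs frame on the stack
		b       ftrace_common   # ftrace_call site -> tracer, return via t0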

Co-developed-by: Jinyang He <[email protected]>
Signed-off-by: Jinyang He <[email protected]>
Signed-off-by: Qing Zhang <[email protected]>
---
arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/ftrace.h | 16 ++++
arch/loongarch/include/asm/inst.h | 33 +++++++++
arch/loongarch/kernel/Makefile | 5 ++
arch/loongarch/kernel/entry_dyn.S | 89 ++++++++++++++++++++++
arch/loongarch/kernel/ftrace_dyn.c | 111 ++++++++++++++++++++++++++++
arch/loongarch/kernel/inst.c | 92 +++++++++++++++++++++++
7 files changed, 347 insertions(+)
create mode 100644 arch/loongarch/kernel/entry_dyn.S
create mode 100644 arch/loongarch/kernel/ftrace_dyn.c

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 4d13bac368ed..f2d4899b1a0e 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -83,6 +83,7 @@ config LOONGARCH
select HAVE_C_RECORDMCOUNT
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_DMA_CONTIGUOUS
+ select HAVE_DYNAMIC_FTRACE
select HAVE_EXIT_THREAD
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/arch/loongarch/include/asm/ftrace.h b/arch/loongarch/include/asm/ftrace.h
index 6a3e76234618..76ca58767f4d 100644
--- a/arch/loongarch/include/asm/ftrace.h
+++ b/arch/loongarch/include/asm/ftrace.h
@@ -10,9 +10,25 @@
#define MCOUNT_INSN_SIZE 4 /* sizeof mcount call */

#ifndef __ASSEMBLY__
+#ifndef CONFIG_DYNAMIC_FTRACE
extern void _mcount(void);
#define mcount _mcount
+#endif

+#ifdef CONFIG_DYNAMIC_FTRACE
+static inline unsigned long ftrace_call_adjust(unsigned long addr)
+{
+ return addr;
+}
+
+struct dyn_arch_ftrace {
+};
+
+struct dyn_ftrace;
+int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
+#define ftrace_init_nop ftrace_init_nop
+
+#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* __ASSEMBLY__ */
#endif /* CONFIG_FUNCTION_TRACER */
#endif /* _ASM_LOONGARCH_FTRACE_H */
diff --git a/arch/loongarch/include/asm/inst.h b/arch/loongarch/include/asm/inst.h
index 7b07cbb3188c..713b4996bfac 100644
--- a/arch/loongarch/include/asm/inst.h
+++ b/arch/loongarch/include/asm/inst.h
@@ -8,6 +8,9 @@
#include <linux/types.h>
#include <asm/asm.h>

+#define INSN_NOP 0x03400000
+#define INSN_BREAK 0x002a0000
+
#define ADDR_IMMMASK_LU52ID 0xFFF0000000000000
#define ADDR_IMMMASK_LU32ID 0x000FFFFF00000000
#define ADDR_IMMMASK_ADDU16ID 0x00000000FFFF0000
@@ -18,6 +21,11 @@

#define ADDR_IMM(addr, INSN) ((addr & ADDR_IMMMASK_##INSN) >> ADDR_IMMSHIFT_##INSN)

+enum reg0i26_op {
+ b_op = 0x14,
+ bl_op = 0x15,
+};
+
enum reg1i20_op {
lu12iw_op = 0x0a,
lu32id_op = 0x0b,
@@ -32,6 +40,7 @@ enum reg2i12_op {
addiw_op = 0x0a,
addid_op = 0x0b,
lu52id_op = 0x0c,
+ ori_op = 0x0e,
ldb_op = 0xa0,
ldh_op = 0xa1,
ldw_op = 0xa2,
@@ -52,6 +61,10 @@ enum reg2i16_op {
bgeu_op = 0x1b,
};

+enum reg3_op {
+ or_op = 0x2a,
+};
+
struct reg0i26_format {
unsigned int immediate_h : 10;
unsigned int immediate_l : 16;
@@ -85,6 +98,13 @@ struct reg2i16_format {
unsigned int opcode : 6;
};

+struct reg3_format {
+ unsigned int rd : 5;
+ unsigned int rj : 5;
+ unsigned int rk : 5;
+ unsigned int opcode : 17;
+};
+
union loongarch_instruction {
unsigned int word;
struct reg0i26_format reg0i26_format;
@@ -92,6 +112,7 @@ union loongarch_instruction {
struct reg1i21_format reg1i21_format;
struct reg2i12_format reg2i12_format;
struct reg2i16_format reg2i16_format;
+ struct reg3_format reg3_format;
};

#define LOONGARCH_INSN_SIZE sizeof(union loongarch_instruction)
@@ -162,6 +183,18 @@ static inline bool is_stack_alloc_ins(union loongarch_instruction *ip)
is_imm12_negative(ip->reg2i12_format.immediate);
}

+int larch_insn_read(void *addr, u32 *insnp);
+int larch_insn_write(void *addr, u32 insn);
+int larch_insn_patch_text(void *addr, u32 insn);
+
+u32 larch_insn_gen_nop(void);
+u32 larch_insn_gen_b(unsigned long pc, unsigned long dest);
+u32 larch_insn_gen_bl(unsigned long pc, unsigned long dest);
+
+u32 larch_insn_gen_or(enum loongarch_gpr rd, enum loongarch_gpr rj,
+ enum loongarch_gpr rk);
+u32 larch_insn_gen_move(enum loongarch_gpr rd, enum loongarch_gpr rj);
+
u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm);
u32 larch_insn_gen_lu52id(enum loongarch_gpr rd, enum loongarch_gpr rj, int imm);
u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, unsigned long pc, unsigned long dest);
diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
index 0a745d24d3e5..a73599619466 100644
--- a/arch/loongarch/kernel/Makefile
+++ b/arch/loongarch/kernel/Makefile
@@ -15,8 +15,13 @@ obj-$(CONFIG_EFI) += efi.o
obj-$(CONFIG_CPU_HAS_FPU) += fpu.o

ifdef CONFIG_FUNCTION_TRACER
+ ifndef CONFIG_DYNAMIC_FTRACE
obj-y += mcount.o ftrace.o
CFLAGS_REMOVE_ftrace.o = $(CC_FLAGS_FTRACE)
+ else
+ obj-y += entry_dyn.o ftrace_dyn.o
+ CFLAGS_REMOVE_ftrace_dyn.o = $(CC_FLAGS_FTRACE)
+ endif
CFLAGS_REMOVE_inst.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_time.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_perf_event.o = $(CC_FLAGS_FTRACE)
diff --git a/arch/loongarch/kernel/entry_dyn.S b/arch/loongarch/kernel/entry_dyn.S
new file mode 100644
index 000000000000..205925bc3822
--- /dev/null
+++ b/arch/loongarch/kernel/entry_dyn.S
@@ -0,0 +1,89 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2022 Loongson Technology Corporation Limited
+ */
+
+#include <asm/export.h>
+#include <asm/regdef.h>
+#include <asm/stackframe.h>
+#include <asm/ftrace.h>
+
+ .text
+/*
+ * Due to -fpatchable-function-entry=2, the compiler inserts 2 NOPs before the
+ * regular C function prologue. When PC arrives here, the last 2 executed
+ * instructions are as follows:
+ * move t0, ra
+ * bl callsite (for modules, callsite is a trampoline)
+ *
+ * The module trampoline is as follows:
+ * lu12i.w t1, callsite[31:12]
+ * lu32i.d t1, callsite[51:32]
+ * lu52i.d t1, t1, callsite[63:52]
+ * jirl zero, t1, callsite[11:0] >> 2
+ *
+ * See arch/loongarch/kernel/ftrace_dyn.c for details. Note that the T-series
+ * regs are available and safe here because each C function follows the
+ * LoongArch psABI well.
+ */
+
+ .macro ftrace_regs_entry
+ PTR_ADDI sp, sp, -PT_SIZE
+ /* Save trace function ra at PT_ERA */
+ PTR_S ra, sp, PT_ERA
+ /* Save parent ra at PT_R1(RA) */
+ PTR_S t0, sp, PT_R1
+ PTR_S a0, sp, PT_R4
+ PTR_S a1, sp, PT_R5
+ PTR_S a2, sp, PT_R6
+ PTR_S a3, sp, PT_R7
+ PTR_S a4, sp, PT_R8
+ PTR_S a5, sp, PT_R9
+ PTR_S a6, sp, PT_R10
+ PTR_S a7, sp, PT_R11
+ PTR_S fp, sp, PT_R22
+
+ PTR_ADDI t8, sp, PT_SIZE
+ PTR_S t8, sp, PT_R3
+
+ .endm
+
+SYM_CODE_START(ftrace_caller)
+ ftrace_regs_entry
+ b ftrace_common
+SYM_CODE_END(ftrace_caller)
+
+SYM_CODE_START(ftrace_common)
+ PTR_ADDI a0, ra, -8 /* arg0: ip */
+ move a1, t0 /* arg1: parent_ip */
+ la.pcrel t1, function_trace_op
+ PTR_L a2, t1, 0 /* arg2: op */
+ move a3, sp /* arg3: regs */
+ .globl ftrace_call
+ftrace_call:
+ bl ftrace_stub
+/*
+ * As we don't use the S-series regs in this assembly code, and all calls
+ * are to C functions which save the S-series regs themselves, there is
+ * no need to restore them. The T-series regs are available and safe at
+ * the callsite, so there is no need to restore them either.
+ */
+ftrace_common_return:
+ PTR_L a0, sp, PT_R4
+ PTR_L a1, sp, PT_R5
+ PTR_L a2, sp, PT_R6
+ PTR_L a3, sp, PT_R7
+ PTR_L a4, sp, PT_R8
+ PTR_L a5, sp, PT_R9
+ PTR_L a6, sp, PT_R10
+ PTR_L a7, sp, PT_R11
+ PTR_L fp, sp, PT_R22
+ PTR_L ra, sp, PT_R1
+ PTR_L t0, sp, PT_ERA
+ PTR_ADDI sp, sp, PT_SIZE
+ jr t0
+SYM_CODE_END(ftrace_common)
+
+SYM_FUNC_START(ftrace_stub)
+ jr ra
+SYM_FUNC_END(ftrace_stub)
diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
new file mode 100644
index 000000000000..1f8955be8b64
--- /dev/null
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Based on arch/arm64/kernel/ftrace.c
+ *
+ * Copyright (C) 2022 Loongson Technology Corporation Limited
+ */
+
+#include <linux/ftrace.h>
+#include <linux/uaccess.h>
+
+#include <asm/inst.h>
+
+static int ftrace_modify_code(unsigned long pc, u32 old, u32 new,
+ bool validate)
+{
+ u32 replaced;
+
+ if (validate) {
+ if (larch_insn_read((void *)pc, &replaced))
+ return -EFAULT;
+
+ if (replaced != old)
+ return -EINVAL;
+ }
+
+ if (larch_insn_patch_text((void *)pc, new))
+ return -EPERM;
+
+ return 0;
+}
+
+int ftrace_update_ftrace_func(ftrace_func_t func)
+{
+ unsigned long pc;
+ u32 new;
+
+ pc = (unsigned long)&ftrace_call;
+ new = larch_insn_gen_bl(pc, (unsigned long)func);
+
+ return ftrace_modify_code(pc, 0, new, false);
+}
+
+/*
+ * The compiler has inserted 2 NOPs before the regular function prologue.
+ * T-series registers are available and safe to use because of the LoongArch psABI.
+ *
+ * At runtime, the nop is replaced with a bl to enable the ftrace call, and the
+ * bl is replaced with a nop to disable it. The bl clobbers RA, so the original
+ * RA value is saved in t0 first.
+ * The details are:
+ *
+ * | Compiled | Disabled | Enabled |
+ * +------------+------------------------+------------------------+
+ * | nop | move t0, ra | move t0, ra |
+ * | nop | nop | bl ftrace_caller |
+ * | func_body | func_body | func_body |
+ *
+ * The RA value will be recovered by ftrace_regs_entry and restored into RA
+ * before returning to the regular function prologue. When a function is not
+ * being traced, the "move t0, ra" is harmless.
+ */
+
+int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
+{
+ unsigned long pc;
+ u32 old, new;
+
+ pc = rec->ip;
+ old = larch_insn_gen_nop();
+ new = larch_insn_gen_move(LOONGARCH_GPR_T0, LOONGARCH_GPR_RA);
+
+ return ftrace_modify_code(pc, old, new, true);
+}
+
+int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
+{
+ unsigned long pc;
+ u32 old, new;
+
+ pc = rec->ip + LOONGARCH_INSN_SIZE;
+
+ old = larch_insn_gen_nop();
+ new = larch_insn_gen_bl(pc, addr);
+
+ return ftrace_modify_code(pc, old, new, true);
+}
+
+int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
+ unsigned long addr)
+{
+ unsigned long pc;
+ u32 old, new;
+
+ pc = rec->ip + LOONGARCH_INSN_SIZE;
+
+ new = larch_insn_gen_nop();
+ old = larch_insn_gen_bl(pc, addr);
+
+ return ftrace_modify_code(pc, old, new, true);
+}
+
+void arch_ftrace_update_code(int command)
+{
+ command |= FTRACE_MAY_SLEEP;
+ ftrace_modify_all_code(command);
+}
+
+int __init ftrace_dyn_arch_init(void)
+{
+ return 0;
+}
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index b1df0ec34bd1..d62cdf4a9ffb 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -2,8 +2,83 @@
/*
* Copyright (C) 2020-2022 Loongson Technology Corporation Limited
*/
+#include <linux/sizes.h>
+#include <linux/uaccess.h>
+
+#include <asm/cacheflush.h>
#include <asm/inst.h>

+static DEFINE_RAW_SPINLOCK(patch_lock);
+
+int larch_insn_read(void *addr, u32 *insnp)
+{
+ int ret;
+ u32 val;
+
+ ret = copy_from_kernel_nofault(&val, addr, LOONGARCH_INSN_SIZE);
+ if (!ret)
+ *insnp = val;
+
+ return ret;
+}
+
+int larch_insn_write(void *addr, u32 insn)
+{
+ int ret;
+ unsigned long flags = 0;
+
+ raw_spin_lock_irqsave(&patch_lock, flags);
+ ret = copy_to_kernel_nofault(addr, &insn, LOONGARCH_INSN_SIZE);
+ raw_spin_unlock_irqrestore(&patch_lock, flags);
+
+ return ret;
+}
+
+int larch_insn_patch_text(void *addr, u32 insn)
+{
+ int ret;
+ u32 *tp = addr;
+
+ if ((unsigned long)tp & 3)
+ return -EINVAL;
+
+ ret = larch_insn_write(tp, insn);
+ if (!ret)
+ flush_icache_range((unsigned long)tp,
+ (unsigned long)tp + LOONGARCH_INSN_SIZE);
+
+ return ret;
+}
+
+u32 larch_insn_gen_nop(void)
+{
+ return INSN_NOP;
+}
+
+u32 larch_insn_gen_bl(unsigned long pc, unsigned long dest)
+{
+ unsigned int immediate_l, immediate_h;
+ union loongarch_instruction insn;
+ long offset = dest - pc;
+
+ if ((offset & 3) || offset < -SZ_128M || offset >= SZ_128M) {
+ pr_warn("The generated bl instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ offset >>= 2;
+
+ immediate_l = offset & 0xffff;
+ offset >>= 16;
+ immediate_h = offset & 0x3ff;
+
+ insn.reg0i26_format.opcode = bl_op;
+ insn.reg0i26_format.immediate_l = immediate_l;
+ insn.reg0i26_format.immediate_h = immediate_h;
+
+ return insn.word;
+}
+
u32 larch_insn_gen_lu32id(enum loongarch_gpr rd, int imm)
{
union loongarch_instruction insn;
@@ -38,3 +113,20 @@ u32 larch_insn_gen_jirl(enum loongarch_gpr rd, enum loongarch_gpr rj, unsigned l

return insn.word;
}
+
+u32 larch_insn_gen_or(enum loongarch_gpr rd, enum loongarch_gpr rj, enum loongarch_gpr rk)
+{
+ union loongarch_instruction insn;
+
+ insn.reg3_format.opcode = or_op;
+ insn.reg3_format.rd = rd;
+ insn.reg3_format.rj = rj;
+ insn.reg3_format.rk = rk;
+
+ return insn.word;
+}
+
+u32 larch_insn_gen_move(enum loongarch_gpr rd, enum loongarch_gpr rj)
+{
+ return larch_insn_gen_or(rd, rj, 0);
+}
--
2.20.1

2022-08-26 08:36:52

by Qing Zhang

Subject: [PATCH v2 4/9] LoongArch/ftrace: Add dynamic function graph tracer support

Once the function_graph tracer is enabled, a filtered function has the
following call sequence:

1) ftrace_caller ==> on/off by ftrace_make_call/ftrace_make_nop
2) ftrace_graph_call ==> on/off by ftrace_enable/disable_ftrace_graph_caller
3) ftrace_graph_caller
4) prepare_ftrace_return

Considering the DYNAMIC_FTRACE_WITH_REGS feature that follows in this series,
it is more extensible to have a separate ftrace_graph_caller function instead
of calling prepare_ftrace_return directly in ftrace_caller.
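
In short, the return path is hijacked (a schematic of the code below):

	at entry: ftrace_graph_caller passes the address of the parent RA
	          (saved at PT_R1 by ftrace_regs_entry) to
	          prepare_ftrace_return(), which replaces it with
	          return_to_handler
	at exit:  "jr ra" lands in return_to_handler, which calls
	          ftrace_return_to_handler() to record the return and fetch
	          the real parent RA, then jumps back to it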

Co-developed-by: Jinyang He <[email protected]>
Signed-off-by: Jinyang He <[email protected]>
Signed-off-by: Qing Zhang <[email protected]>
---
arch/loongarch/kernel/entry_dyn.S | 33 ++++++++++++++++++++++
arch/loongarch/kernel/ftrace_dyn.c | 45 ++++++++++++++++++++++++++++++
arch/loongarch/kernel/inst.c | 24 ++++++++++++++++
3 files changed, 102 insertions(+)

diff --git a/arch/loongarch/kernel/entry_dyn.S b/arch/loongarch/kernel/entry_dyn.S
index 205925bc3822..0c12cc108e6f 100644
--- a/arch/loongarch/kernel/entry_dyn.S
+++ b/arch/loongarch/kernel/entry_dyn.S
@@ -62,6 +62,11 @@ SYM_CODE_START(ftrace_common)
.globl ftrace_call
ftrace_call:
bl ftrace_stub
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ .globl ftrace_graph_call
+ftrace_graph_call:
+ nop /* b ftrace_graph_caller */
+#endif
/*
 * As we don't use the S-series regs in this assembly code, and all calls
 * are to C functions which save the S-series regs themselves, there is
@@ -84,6 +89,34 @@ ftrace_common_return:
jr t0
SYM_CODE_END(ftrace_common)

+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+SYM_CODE_START(ftrace_graph_caller)
+ PTR_L a0, sp, PT_ERA
+ PTR_ADDI a0, a0, -8 /* arg0: self_addr */
+ PTR_ADDI a1, sp, PT_R1 /* arg1: parent */
+ bl prepare_ftrace_return
+ b ftrace_common_return
+SYM_CODE_END(ftrace_graph_caller)
+
+SYM_CODE_START(return_to_handler)
+ /* save return value regs */
+ PTR_ADDI sp, sp, -2 * SZREG
+ PTR_S a0, sp, 0
+ PTR_S a1, sp, SZREG
+
+ move a0, zero /* No FP check for now */
+ bl ftrace_return_to_handler
+ move ra, a0 /* parent ra */
+
+ /* restore return value regs */
+ PTR_L a0, sp, 0
+ PTR_L a1, sp, SZREG
+ PTR_ADDI sp, sp, 2 * SZREG
+
+ jr ra
+SYM_CODE_END(return_to_handler)
+#endif
+
SYM_FUNC_START(ftrace_stub)
jr ra
SYM_FUNC_END(ftrace_stub)
diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
index 1f8955be8b64..3fe791b6783e 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -109,3 +109,48 @@ int __init ftrace_dyn_arch_init(void)
{
return 0;
}
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+extern void ftrace_graph_call(void);
+
+void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent)
+{
+ unsigned long return_hooker = (unsigned long)&return_to_handler;
+ unsigned long old;
+
+ if (unlikely(atomic_read(&current->tracing_graph_pause)))
+ return;
+
+ old = *parent;
+
+ if (!function_graph_enter(old, self_addr, 0, NULL))
+ *parent = return_hooker;
+}
+
+static int ftrace_modify_graph_caller(bool enable)
+{
+ unsigned long pc, func;
+ u32 branch, nop;
+
+ pc = (unsigned long)&ftrace_graph_call;
+ func = (unsigned long)&ftrace_graph_caller;
+
+ branch = larch_insn_gen_b(pc, func);
+ nop = larch_insn_gen_nop();
+
+ if (enable)
+ return ftrace_modify_code(pc, nop, branch, true);
+ else
+ return ftrace_modify_code(pc, branch, nop, true);
+}
+
+int ftrace_enable_ftrace_graph_caller(void)
+{
+ return ftrace_modify_graph_caller(true);
+}
+
+int ftrace_disable_ftrace_graph_caller(void)
+{
+ return ftrace_modify_graph_caller(false);
+}
+#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/loongarch/kernel/inst.c b/arch/loongarch/kernel/inst.c
index d62cdf4a9ffb..2d2e942eb06a 100644
--- a/arch/loongarch/kernel/inst.c
+++ b/arch/loongarch/kernel/inst.c
@@ -55,6 +55,30 @@ u32 larch_insn_gen_nop(void)
return INSN_NOP;
}

+u32 larch_insn_gen_b(unsigned long pc, unsigned long dest)
+{
+ unsigned int immediate_l, immediate_h;
+ union loongarch_instruction insn;
+ long offset = dest - pc;
+
+ if ((offset & 3) || offset < -SZ_128M || offset >= SZ_128M) {
+ pr_warn("The generated b instruction is out of range.\n");
+ return INSN_BREAK;
+ }
+
+ offset >>= 2;
+
+ immediate_l = offset & 0xffff;
+ offset >>= 16;
+ immediate_h = offset & 0x3ff;
+
+ insn.reg0i26_format.opcode = b_op;
+ insn.reg0i26_format.immediate_l = immediate_l;
+ insn.reg0i26_format.immediate_h = immediate_h;
+
+ return insn.word;
+}
+
u32 larch_insn_gen_bl(unsigned long pc, unsigned long dest)
{
unsigned int immediate_l, immediate_h;
--
2.20.1

2022-08-26 08:38:19

by Qing Zhang

Subject: [PATCH v2 6/9] LoongArch/ftrace: Add HAVE_DYNAMIC_FTRACE_WITH_ARGS support

Allow for arguments to be passed in to ftrace_regs by default. If this is
set, then arguments and the stack can be found from the pt_regs.

1. HAVE_DYNAMIC_FTRACE_WITH_ARGS doesn't need a special hook for the graph
tracer entry point; instead we can use the graph_ops::func function to
install the return_hooker (see the sketch after this list).
2. Livepatch will require this option in the future.
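
For example, a hypothetical ftrace_ops callback (a sketch, not part of this
patch) can then read the traced function's first argument from the saved
pt_regs:

	static void my_trace_func(unsigned long ip, unsigned long parent_ip,
				  struct ftrace_ops *op, struct ftrace_regs *fregs)
	{
		struct pt_regs *regs = arch_ftrace_get_regs(fregs);
		/* a0 (regs[4]) holds the first argument per the psABI */
		unsigned long arg0 = regs->regs[4];

		pr_debug("%ps(0x%lx)\n", (void *)ip, arg0);
	}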

Signed-off-by: Qing Zhang <[email protected]>
---
arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/ftrace.h | 17 +++++++++++++++++
arch/loongarch/kernel/ftrace_dyn.c | 12 ++++++++++++
3 files changed, 30 insertions(+)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 22eb3d6f8537..96902647b692 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -84,6 +84,7 @@ config LOONGARCH
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
+ select HAVE_DYNAMIC_FTRACE_WITH_ARGS
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_EXIT_THREAD
select HAVE_FAST_GUP
diff --git a/arch/loongarch/include/asm/ftrace.h b/arch/loongarch/include/asm/ftrace.h
index a3f974a7a5ce..4a9db84f8264 100644
--- a/arch/loongarch/include/asm/ftrace.h
+++ b/arch/loongarch/include/asm/ftrace.h
@@ -28,6 +28,23 @@ struct dyn_ftrace;
int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
#define ftrace_init_nop ftrace_init_nop

+#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+struct ftrace_ops;
+
+struct ftrace_regs {
+ struct pt_regs regs;
+};
+
+static __always_inline struct pt_regs *arch_ftrace_get_regs(struct ftrace_regs *fregs)
+{
+ return &fregs->regs;
+}
+
+void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
+ struct ftrace_ops *op, struct ftrace_regs *fregs);
+#define ftrace_graph_func ftrace_graph_func
+#endif
+
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
#define ARCH_SUPPORTS_FTRACE_OPS 1
#endif
diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
index ec3d951be50c..f538829312d7 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -144,6 +144,17 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent)
*parent = return_hooker;
}

+#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
+ struct ftrace_ops *op, struct ftrace_regs *fregs)
+{
+ struct pt_regs *regs = &fregs->regs;
+ unsigned long *parent = (unsigned long *)&regs->regs[1];
+
+ prepare_ftrace_return(ip, (unsigned long *)parent);
+}
+#else
+
static int ftrace_modify_graph_caller(bool enable)
{
unsigned long pc, func;
@@ -170,4 +181,5 @@ int ftrace_disable_ftrace_graph_caller(void)
{
return ftrace_modify_graph_caller(false);
}
+#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
--
2.20.1

2022-09-01 03:08:14

by Jeff Xie

Subject: Re: [PATCH v2 3/9] LoongArch/ftrace: Add dynamic function tracer support

On Fri, Aug 26, 2022 at 4:24 PM Qing Zhang <[email protected]> wrote:
>
> The compiler has inserted 2 NOPs before the regular function prologue.
> T-series registers are available and safe to use because of the LoongArch psABI.
>
[...]
>

When using the func_stack_trace option with the function tracer, I found an issue:

Steps:
1. Enable the function tracer and the option func_stack_trace:

/sys/kernel/tracing # echo blk_update_request > ./set_ftrace_filter
/sys/kernel/tracing # echo 1 > ./options/func_stack_trace
/sys/kernel/tracing # echo function > ./current_tracer

2. Let the blk_update_request() be called.

# mount /dev/vda /tmp


3. cat ./trace
<idle>-0 [000] ..s1. 126.016445: blk_update_request <-blk_mq_end_request
<idle>-0 [000] ..s1. 126.017937: <stack trace>
=> blk_mq_end_request

We can see only one entry in the stack trace.


I found the default unwinder (for loongson3_defconfig) is
CONFIG_UNWINDER_PROLOGUE; if I switch it to CONFIG_UNWINDER_GUESS,
it works well:

3. cat ./trace
<idle>-0 [000] ..s1. 75.003356: blk_update_request <-blk_mq_end_request
<idle>-0 [000] ..s1. 75.004963: <stack trace>
=> function_stack_trace_call
=> ftrace_graph_call
=> blk_mq_end_request
=> virtblk_done
=> vring_interrupt
=> __handle_irq_event_percpu
=> blk_update_request
=> handle_edge_irq
=> blk_complete_reqs
=> __do_softirq
=> irq_exit_rcu
=> do_vint
=> finish_task_switch.isra.0
=> schedule
=> finish_task_switch.isra.0
=> schedule_idle
=> __schedule
=> tick_nohz_restart
=> schedule_idle
=> cpu_startup_entry
=> kernel_init
=> arch_post_acpi_subsys_init
=> start_kernel
=> smpboot_entry
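
For reference, switching unwinders is just a Kconfig choice; the relevant
.config fragment looks like:

CONFIG_UNWINDER_GUESS=y
# CONFIG_UNWINDER_PROLOGUE is not set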

Maybe the issue lies in CONFIG_UNWINDER_PROLOGUE, but I
haven't dug deep into it ;)

--
Thanks,
JeffXie

2022-09-03 00:46:27

by Qing Zhang

Subject: Re: [PATCH v2 3/9] LoongArch/ftrace: Add dynamic function tracer support



On 2022/9/1 上午10:59, Jeff Xie wrote:
> On Fri, Aug 26, 2022 at 4:24 PM Qing Zhang <[email protected]> wrote:
>>
>> The compiler has inserted 2 NOPs before the regular function prologue.
>> T-series registers are available and safe to use because of the LoongArch psABI.
>>
[...]
>>
>
> When using the func_stack_trace option with the function tracer, I found an issue:
>
> Steps:
> 1. Enable the function tracer and the option func_stack_trace:
>
> /sys/kernel/tracing # echo blk_update_request > ./set_ftrace_filter
> /sys/kernel/tracing # echo 1 > ./options/func_stack_trace
> /sys/kernel/tracing # echo function > ./current_tracer
>
> 2. Let the blk_update_request() be called.
>
> # mount /dev/vda /tmp
>
>
> 3. cat ./trace
> <idle>-0 [000] ..s1. 126.016445: blk_update_request <-blk_mq_end_request
> <idle>-0 [000] ..s1. 126.017937: <stack trace>
> => blk_mq_end_request
>
> We can see only one entry in the stack trace.
>
>
> I found the default unwinder (for loongson3_defconfig) is
> CONFIG_UNWINDER_PROLOGUE; if I switch it to CONFIG_UNWINDER_GUESS,
> it works well:
>
[...]
>
> Maybe the issue lies in CONFIG_UNWINDER_PROLOGUE, but I
> haven't dug deep into it ;)

Hi, Jeff

Thanks a lot for your feedback!

I fixed it in v3; it was caused by the ftrace_regs_entry assembly not
being considered by the prologue analysis method. :)

regards
-Qing

>