2022-11-06 10:14:22

by Xim

Subject: [PATCH v4 0/8] Add OPTPROBES feature on RISCV

From: Liao Chang <[email protected]>

Add jump optimization support for RISC-V.

Replace the ebreak instructions used by normal kprobes with an
auipc+jalr instruction pair, with the aim of reducing the probe-hit
overhead.

All known optprobe-capable RISC architectures use a single jump or
branch instruction, while this series does not. RISC-V has a quite
limited jump range (4KB or 2MB) for its branch and jump instructions,
which prevents the optimization from supporting probes that are spread
all over the kernel.

An auipc+jalr instruction pair is introduced with a much wider jump
range (4GB), where auipc loads the upper 20 bits into a free register
and jalr adds the lower 12 bits to form a 32-bit immediate. Note that
returning from the probe handler requires another free register. As
kprobes can appear almost anywhere inside the kernel, the free register
should be found in a generic way, not depending on the calling
convention or any other regulations.
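The split of a PC-relative offset between auipc and jalr can be
sketched as below. This is a minimal illustration, not code from the
series: the helper names (lo12, hi20, auipc_jalr_target) are
hypothetical. It relies only on the ISA rule that jalr sign-extends its
12-bit immediate, which is why the upper part is rounded by 0x800.

```c
#include <assert.h>
#include <stdint.h>

/* Low 12 bits of the offset, interpreted as the signed JALR immediate. */
static int32_t lo12(int32_t offset)
{
	uint32_t low = (uint32_t)offset & 0xfffu;

	return (int32_t)(low ^ 0x800u) - 0x800;
}

/*
 * Upper 20 bits for AUIPC. Subtracting the signed low part first is
 * equivalent to the usual (offset + 0x800) >> 12 rounding.
 */
static int32_t hi20(int32_t offset)
{
	return (int32_t)(((int64_t)offset - lo12(offset)) / 4096);
}

/* Address reached by "auipc rd, hi20; jalr rd, lo12(rd)" placed at pc. */
static int64_t auipc_jalr_target(int64_t pc, int32_t offset)
{
	int64_t reg = pc + (int64_t)hi20(offset) * 4096; /* auipc */

	/* The two parts must add up to the requested offset. */
	assert((int64_t)hi20(offset) * 4096 + lo12(offset) == offset);
	return reg + lo12(offset);                       /* jalr  */
}
```

For example, an offset of 0x800 is split into hi20 = 1 and lo12 = -2048,
so the pair still lands exactly 0x800 bytes after the auipc.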

The algorithm for finding the free register is inspired by register
renaming in modern processors. From the perspective of register renaming,
a register could be represented as two different registers if two
neighbouring instructions both write to it and nothing reads it in
between. Extending this fact, a register is considered free if there is
no read before its next write in the execution flow. We are free to
change its value without interfering with normal execution.

Static analysis shows that 51% of the instructions in the kernel
(default config) can be replaced, i.e. one free register can be found at
both the start and the end of the replaced instruction pair, and the
replaced instructions can be executed directly.

Contribution:
Chen Guokai invented the algorithm for searching free registers,
evaluated the optimization ratio, and wrote the basic functionality that
supports an RVI kernel binary. Liao Chang added support for hybrid RVI
and RVC kernel binaries, fixed some bugs with different kernel
configurations, and split the entire feature into individual patches.

v4:
Correct the sequence of Signed-off-by and Co-developed-by.

v3:
1. Support of hybrid RVI and RVC kernel binary.
2. Split the entire feature into individual patches.

v2:
1. Adjust comments
2. Remove improper copyright
3. Clean up formatting that does not follow common practice
4. Extract common definition of instruction decoder
5. Fix race issue in SMP platform.

v1:
Chen Guokai contributes the basic functionality code.

Liao Chang (8):
riscv/kprobe: Prepare the skeleton to implement RISCV OPTPROBES
feature
riscv/kprobe: Allocate detour buffer from module area
riscv/kprobe: Prepare the skeleton to prepare optimized kprobe
riscv/kprobe: Add common RVI and RVC instruction decoder code
riscv/kprobe: Search free register(s) to clobber for 'AUIPC/JALR'
riscv/kprobe: Add code to check if kprobe can be optimized
riscv/kprobe: Prepare detour buffer for optimized kprobe
riscv/kprobe: Patch AUIPC/JALR pair to optimize kprobe

arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/bug.h | 5 +-
arch/riscv/include/asm/kprobes.h | 48 ++
arch/riscv/include/asm/patch.h | 1 +
arch/riscv/kernel/patch.c | 22 +-
arch/riscv/kernel/probes/Makefile | 1 +
arch/riscv/kernel/probes/decode-insn.h | 145 ++++++
arch/riscv/kernel/probes/kprobes.c | 25 +
arch/riscv/kernel/probes/opt.c | 602 ++++++++++++++++++++++
arch/riscv/kernel/probes/opt_trampoline.S | 137 +++++
arch/riscv/kernel/probes/simulate-insn.h | 41 ++
11 files changed, 1023 insertions(+), 5 deletions(-)
create mode 100644 arch/riscv/kernel/probes/opt.c
create mode 100644 arch/riscv/kernel/probes/opt_trampoline.S

--
2.25.1



2022-11-06 10:21:19

by Xim

Subject: [PATCH v4 4/8] riscv/kprobe: Add common RVI and RVC instruction decoder code

From: Liao Chang <[email protected]>

This patch adds code to decode RVI and RVC instructions when searching
for a free register for 'AUIPC/JALR'. As mentioned in the previous
patch, a kprobe can't be optimized until a free integer register is
found to hold the jump target. To carry out the register search, every
instruction from the kprobe to the last one of the function needs to be
decoded and tested for whether it contains a candidate register.

For all RVI instruction formats, the position and length of the 'rs1',
'rs2', 'rd' and 'opcode' fields are uniform, but the rules for the RVC
instruction formats are more complicated, so this patch adds a couple of
inline functions to decode rs1/rs2/rd for RVC.

These instruction decoders are supposed to be consistent with the RVC
and RV32/RV64G instruction set listings specified in the RISC-V
instruction reference published on August 25, 2022.
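As an illustration of the uniform RVI field layout, the sketch below
decodes rs1/rs2/rd and the B-type branch immediate in standalone C. The
function names are hypothetical and only mirror the helpers in this
patch. sign_extend() takes the index of the sign bit (bit 12 for a
branch offset), which is an assumption about how the kernel's
sign_extend32() index argument is meant to be used.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend 'value' using bit 'sign_bit' (0-based) as the sign bit. */
static int32_t sign_extend(uint32_t value, unsigned int sign_bit)
{
	uint32_t m;

	assert(sign_bit < 32);
	m = UINT32_C(1) << sign_bit;
	value &= (m << 1) - 1;          /* drop bits above the sign bit */
	return (int32_t)((int64_t)(value ^ m) - (int64_t)m);
}

/* Register fields sit at fixed positions in every RVI format. */
static unsigned int insn_rd(uint32_t insn)  { return (insn >> 7) & 0x1f; }
static unsigned int insn_rs1(uint32_t insn) { return (insn >> 15) & 0x1f; }
static unsigned int insn_rs2(uint32_t insn) { return (insn >> 20) & 0x1f; }

/* B-type immediate: imm[12|10:5] in bits 31:25, imm[4:1|11] in 11:7. */
static int32_t branch_offset(uint32_t insn)
{
	uint32_t imm = (((insn >> 8) & 0xf) << 1) |
		       (((insn >> 25) & 0x3f) << 5) |
		       (((insn >> 7) & 0x1) << 11) |
		       (((insn >> 31) & 0x1) << 12);

	return sign_extend(imm, 12);
}
```

For instance, 0x00c58533 is "add a0,a1,a2" (rd = 10, rs1 = 11,
rs2 = 12), and 0xfe000ee3 is "beq x0,x0,-4", a backward branch whose
offset only comes out negative when bit 12 is treated as the sign bit.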

Signed-off-by: Liao Chang <[email protected]>
Co-developed-by: Chen Guokai <[email protected]>
Signed-off-by: Chen Guokai <[email protected]>
---
arch/riscv/include/asm/bug.h | 5 +-
arch/riscv/kernel/probes/decode-insn.h | 145 +++++++++++++++++++++++
arch/riscv/kernel/probes/simulate-insn.h | 41 +++++++
3 files changed, 190 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h
index 1aaea81fb141..9c33d3b58225 100644
--- a/arch/riscv/include/asm/bug.h
+++ b/arch/riscv/include/asm/bug.h
@@ -19,11 +19,14 @@
#define __BUG_INSN_32 _UL(0x00100073) /* ebreak */
#define __BUG_INSN_16 _UL(0x9002) /* c.ebreak */

+#define RVI_INSN_LEN 4UL
+#define RVC_INSN_LEN 2UL
+
#define GET_INSN_LENGTH(insn) \
({ \
unsigned long __len; \
__len = ((insn & __INSN_LENGTH_MASK) == __INSN_LENGTH_32) ? \
- 4UL : 2UL; \
+ RVI_INSN_LEN : RVC_INSN_LEN; \
__len; \
})

diff --git a/arch/riscv/kernel/probes/decode-insn.h b/arch/riscv/kernel/probes/decode-insn.h
index 42269a7d676d..1c202b0ac7d4 100644
--- a/arch/riscv/kernel/probes/decode-insn.h
+++ b/arch/riscv/kernel/probes/decode-insn.h
@@ -3,6 +3,7 @@
#ifndef _RISCV_KERNEL_KPROBES_DECODE_INSN_H
#define _RISCV_KERNEL_KPROBES_DECODE_INSN_H

+#include <linux/bitops.h>
#include <asm/sections.h>
#include <asm/kprobes.h>

@@ -15,4 +16,148 @@ enum probe_insn {
enum probe_insn __kprobes
riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *asi);

+static inline u16 rvi_rs1(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 15) & 0x1f);
+}
+
+static inline u16 rvi_rs2(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 20) & 0x1f);
+}
+
+static inline u16 rvi_rd(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 7) & 0x1f);
+}
+
+static inline s32 rvi_branch_imme(kprobe_opcode_t opcode)
+{
+ u32 imme = 0;
+
+ imme |= (((opcode >> 8) & 0xf) << 1) |
+ (((opcode >> 25) & 0x3f) << 5) |
+ (((opcode >> 7) & 0x1) << 11) |
+ (((opcode >> 31) & 0x1) << 12);
+
+ return sign_extend32(imme, 12);
+}
+
+static inline s32 rvi_jal_imme(kprobe_opcode_t opcode)
+{
+ u32 imme = 0;
+
+ imme |= (((opcode >> 21) & 0x3ff) << 1) |
+ (((opcode >> 20) & 0x1) << 11) |
+ (((opcode >> 12) & 0xff) << 12) |
+ (((opcode >> 31) & 0x1) << 20);
+
+ return sign_extend32(imme, 20);
+}
+
+#ifdef CONFIG_RISCV_ISA_C
+static inline u16 rvc_r_rs1(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 2) & 0x1f);
+}
+
+static inline u16 rvc_r_rs2(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 2) & 0x1f);
+}
+
+static inline u16 rvc_r_rd(kprobe_opcode_t opcode)
+{
+ return rvc_r_rs1(opcode);
+}
+
+static inline u16 rvc_i_rs1(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 7) & 0x1f);
+}
+
+static inline u16 rvc_i_rd(kprobe_opcode_t opcode)
+{
+ return rvc_i_rs1(opcode);
+}
+
+static inline u16 rvc_ss_rs2(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 2) & 0x1f);
+}
+
+static inline u16 rvc_l_rd(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 2) & 0x7);
+}
+
+static inline u16 rvc_l_rs(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 7) & 0x7);
+}
+
+static inline u16 rvc_s_rs2(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 2) & 0x7);
+}
+
+static inline u16 rvc_s_rs1(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 7) & 0x7);
+}
+
+static inline u16 rvc_a_rs2(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 2) & 0x7);
+}
+
+static inline u16 rvc_a_rs1(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 7) & 0x7);
+}
+
+static inline u16 rvc_a_rd(kprobe_opcode_t opcode)
+{
+ return rvc_a_rs1(opcode);
+}
+
+static inline u16 rvc_b_rd(kprobe_opcode_t opcode)
+{
+ return (u16)((opcode >> 7) & 0x7);
+}
+
+static inline u16 rvc_b_rs(kprobe_opcode_t opcode)
+{
+ return rvc_b_rd(opcode);
+}
+
+static inline s32 rvc_branch_imme(kprobe_opcode_t opcode)
+{
+ u32 imme = 0;
+
+ imme |= (((opcode >> 3) & 0x3) << 1) |
+ (((opcode >> 10) & 0x3) << 3) |
+ (((opcode >> 2) & 0x1) << 5) |
+ (((opcode >> 5) & 0x3) << 6) |
+ (((opcode >> 12) & 0x1) << 8);
+
+ return sign_extend32(imme, 8);
+}
+
+static inline s32 rvc_jal_imme(kprobe_opcode_t opcode)
+{
+ u32 imme = 0;
+
+ imme |= (((opcode >> 3) & 0x7) << 1) |
+ (((opcode >> 11) & 0x1) << 4) |
+ (((opcode >> 2) & 0x1) << 5) |
+ (((opcode >> 7) & 0x1) << 6) |
+ (((opcode >> 6) & 0x1) << 7) |
+ (((opcode >> 9) & 0x3) << 8) |
+ (((opcode >> 8) & 0x1) << 10) |
+ (((opcode >> 12) & 0x1) << 11);
+
+ return sign_extend32(imme, 11);
+}
+#endif /* CONFIG_RISCV_ISA_C */
#endif /* _RISCV_KERNEL_KPROBES_DECODE_INSN_H */
diff --git a/arch/riscv/kernel/probes/simulate-insn.h b/arch/riscv/kernel/probes/simulate-insn.h
index cb6ff7dccb92..74d8c1ba9064 100644
--- a/arch/riscv/kernel/probes/simulate-insn.h
+++ b/arch/riscv/kernel/probes/simulate-insn.h
@@ -37,6 +37,40 @@ __RISCV_INSN_FUNCS(c_jalr, 0xf007, 0x9002);
__RISCV_INSN_FUNCS(c_beqz, 0xe003, 0xc001);
__RISCV_INSN_FUNCS(c_bnez, 0xe003, 0xe001);
__RISCV_INSN_FUNCS(c_ebreak, 0xffff, 0x9002);
+/* RVC(S) instructions contain rs1 and rs2 */
+__RISCV_INSN_FUNCS(c_sq, 0xe003, 0xa000);
+__RISCV_INSN_FUNCS(c_sw, 0xe003, 0xc000);
+__RISCV_INSN_FUNCS(c_sd, 0xe003, 0xe000);
+/* RVC(A) instructions contain rs1 and rs2 */
+__RISCV_INSN_FUNCS(c_sub, 0xfc03, 0x8c01);
+__RISCV_INSN_FUNCS(c_subw, 0xfc43, 0x9c01);
+/* RVC(L) instructions contain rs1 */
+__RISCV_INSN_FUNCS(c_lq, 0xe003, 0x2000);
+__RISCV_INSN_FUNCS(c_lw, 0xe003, 0x4000);
+__RISCV_INSN_FUNCS(c_ld, 0xe003, 0x6000);
+/* RVC(I) instructions contain rs1 */
+__RISCV_INSN_FUNCS(c_addi, 0xe003, 0x0001);
+__RISCV_INSN_FUNCS(c_addiw, 0xe003, 0x2001);
+__RISCV_INSN_FUNCS(c_addi16sp, 0xe183, 0x6101);
+__RISCV_INSN_FUNCS(c_slli, 0xe003, 0x0002);
+/* RVC(B) instructions contain rs1 */
+__RISCV_INSN_FUNCS(c_sri, 0xe803, 0x8001);
+__RISCV_INSN_FUNCS(c_andi, 0xec03, 0x8801);
+/* RVC(SS) instructions contain rs2 */
+__RISCV_INSN_FUNCS(c_sqsp, 0xe003, 0xa002);
+__RISCV_INSN_FUNCS(c_swsp, 0xe003, 0xc002);
+__RISCV_INSN_FUNCS(c_sdsp, 0xe003, 0xe002);
+/* RVC(R) instructions contain rs2 and rd */
+__RISCV_INSN_FUNCS(c_mv, 0xe003, 0x8002);
+/* RVC(I) instructions contain sp and rd */
+__RISCV_INSN_FUNCS(c_lqsp, 0xe003, 0x2002);
+__RISCV_INSN_FUNCS(c_lwsp, 0xe003, 0x4002);
+__RISCV_INSN_FUNCS(c_ldsp, 0xe003, 0x6002);
+/* RVC(CW) instructions contain sp and rd */
+__RISCV_INSN_FUNCS(c_addi4spn, 0xe003, 0x0000);
+/* RVC(I) instructions contain rd */
+__RISCV_INSN_FUNCS(c_li, 0xe003, 0x4001);
+__RISCV_INSN_FUNCS(c_lui, 0xe003, 0x6001);

__RISCV_INSN_FUNCS(auipc, 0x7f, 0x17);
__RISCV_INSN_FUNCS(branch, 0x7f, 0x63);
@@ -44,4 +78,11 @@ __RISCV_INSN_FUNCS(branch, 0x7f, 0x63);
__RISCV_INSN_FUNCS(jal, 0x7f, 0x6f);
__RISCV_INSN_FUNCS(jalr, 0x707f, 0x67);

+__RISCV_INSN_FUNCS(arith_rr, 0x77, 0x33);
+__RISCV_INSN_FUNCS(arith_ri, 0x77, 0x13);
+__RISCV_INSN_FUNCS(lui, 0x7f, 0x37);
+__RISCV_INSN_FUNCS(load, 0x7f, 0x03);
+__RISCV_INSN_FUNCS(store, 0x7f, 0x23);
+__RISCV_INSN_FUNCS(amo, 0x7f, 0x2f);
+
#endif /* _RISCV_KERNEL_PROBES_SIMULATE_INSN_H */
--
2.25.1


2022-11-06 10:24:44

by Xim

Subject: [PATCH v4 6/8] riscv/kprobe: Add code to check if kprobe can be optimized

From: Liao Chang <[email protected]>

This patch adds code to check whether a kprobe can be optimized. A
regular kprobe replaces a single instruction with EBREAK or C.EBREAK; it
only requires that the instrumented instruction supports execution
out-of-line or simulation. An optimized kprobe patches an AUIPC/JALR
pair to do a long jump, which makes everything more complicated,
especially for a kernel that is a hybrid RVI and RVC binary. Although
AUIPC/JALR needs just 8 bytes of space, the patched range is 10 bytes
long in the worst case to ensure that no RVI instruction is truncated,
so there are four ways to patch an optimized kprobe.

- Replace 2 RVI with AUIPC/JALR.
- Replace 4 RVC with AUIPC/JALR.
- Replace 2 RVC and 1 RVI with AUIPC/JALR.
- Replace 3 RVC and 1 RVI with AUIPC/JALR, and patch C.NOP into last
two bytes for alignment.

So it has to find an instruction window large enough to patch
AUIPC/JALR, starting from the address of the instrumented breakpoint,
and meanwhile ensure that no instruction can jump into the patched
window.
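The window selection described above can be sketched in standalone C as
follows. This is a simplified model rather than the code in this patch:
insn_length() and patch_window() are hypothetical names, the code works
on an array of 16-bit parcels instead of kernel text, and it does not
model the MAX_COPIED_INSN limit or the XOI check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RVI_LEN 4u
#define RVC_LEN 2u

/* A 32-bit instruction has both low bits of its first parcel set. */
static unsigned int insn_length(uint16_t first_parcel)
{
	return (first_parcel & 0x3) == 0x3 ? RVI_LEN : RVC_LEN;
}

/*
 * Walk whole instructions until at least 8 bytes (AUIPC + JALR) are
 * covered. Returns the window size in bytes (8 or 10), or 0 if the
 * code ends before a large enough window is found.
 */
static unsigned int patch_window(const uint16_t *parcels, size_t nr_parcels)
{
	size_t i = 0;
	unsigned int bytes = 0;

	while (bytes < 2 * RVI_LEN) {
		unsigned int len;

		if (i >= nr_parcels)
			return 0;
		len = insn_length(parcels[i]);
		if (i + len / 2 > nr_parcels)
			return 0;       /* truncated instruction */
		bytes += len;
		i += len / 2;
	}
	return bytes;
}
```

With 0x0001 (c.nop) standing in for any RVC instruction and the parcel
pair 0x8533, 0x00c5 for an RVI one, three RVC followed by one RVI
instruction give a 10-byte window, which is the case where the last two
bytes are padded with C.NOP.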

Signed-off-by: Liao Chang <[email protected]>
Co-developed-by: Chen Guokai <[email protected]>
Signed-off-by: Chen Guokai <[email protected]>
---
arch/riscv/kernel/probes/opt.c | 99 ++++++++++++++++++++++++++++++++--
1 file changed, 94 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index 6d23c843832e..876bec539554 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -271,15 +271,103 @@ static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
*ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);
}

+static bool insn_jump_into_range(unsigned long addr, unsigned long start,
+ unsigned long end)
+{
+ kprobe_opcode_t insn = *(kprobe_opcode_t *)addr;
+ unsigned long target, offset = GET_INSN_LENGTH(insn);
+
+#ifdef CONFIG_RISCV_ISA_C
+ if (offset == RVC_INSN_LEN) {
+ if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn))
+ target = addr + rvc_branch_imme(insn);
+ else if (riscv_insn_is_c_jal(insn) || riscv_insn_is_c_j(insn))
+ target = addr + rvc_jal_imme(insn);
+ else
+ target = addr + offset;
+ return (target >= start) && (target < end);
+ }
+#endif
+
+ if (riscv_insn_is_branch(insn))
+ target = addr + rvi_branch_imme(insn);
+ else if (riscv_insn_is_jal(insn))
+ target = addr + rvi_jal_imme(insn);
+ else
+ target = addr + offset;
+ return (target >= start) && (target < end);
+}
+
+static int search_copied_insn(unsigned long paddr, struct optimized_kprobe *op)
+{
+ int i = 1;
+ unsigned long offset = GET_INSN_LENGTH(*(kprobe_opcode_t *)paddr);
+
+ while ((i++ < MAX_COPIED_INSN) && (offset < 2 * RVI_INSN_LEN)) {
+ if (riscv_probe_decode_insn((probe_opcode_t *)(paddr + offset),
+ NULL) != INSN_GOOD)
+ return -1;
+ offset += GET_INSN_LENGTH(*(kprobe_opcode_t *)(paddr + offset));
+ }
+
+ op->optinsn.length = offset;
+ return 0;
+}
+
/*
- * If two free registers can be found at the beginning of both
- * the start and the end of replaced code, it can be optimized
- * Also, in-function jumps need to be checked to make sure that
- * there is no jump to the second instruction to be replaced
+ * The kprobe can be optimized when no in-function jump reaches to the
+ * instructions replaced by optimized jump instructions(AUIPC/JALR).
*/
static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
{
- return false;
+ int ret;
+ unsigned long addr, size = 0, offset = 0;
+ struct kprobe *kp = get_kprobe((kprobe_opcode_t *)paddr);
+
+ /*
+ * Skip optimization if kprobe has been disarmed or instrumented
+ * instruction doesn't support XOI.
+ */
+ if (!kp || (riscv_probe_decode_insn(&kp->opcode, NULL) != INSN_GOOD))
+ return false;
+
+ /*
+ * Find a instruction window large enough to contain a pair
+ * of AUIPC/JALR, and ensure each instruction in this window
+ * supports XOI.
+ */
+ ret = search_copied_insn(paddr, op);
+ if (ret)
+ return false;
+
+ if (!kallsyms_lookup_size_offset(paddr, &size, &offset))
+ return false;
+
+ /* Check there is enough space for relative jump(AUIPC/JALR) */
+ if (size - offset <= op->optinsn.length)
+ return false;
+
+ /*
+ * Decode instructions until function end, check that no instruction
+ * jumps into the window used to emit optprobe(AUIPC/JALR).
+ */
+ addr = paddr - offset;
+ while (addr < paddr) {
+ if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN,
+ paddr + op->optinsn.length))
+ return false;
+ addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr);
+ }
+
+ addr = paddr + op->optinsn.length;
+ while (addr < paddr - offset + size) {
+ if (insn_jump_into_range(addr, paddr + RVC_INSN_LEN,
+ paddr + op->optinsn.length))
+ return false;
+ addr += GET_INSN_LENGTH(*(kprobe_opcode_t *)addr);
+ }
+
+ return true;
}

int arch_prepared_optinsn(struct arch_optimized_insn *optinsn)
--
2.25.1


2022-11-06 10:27:59

by Xim

Subject: [PATCH v4 5/8] riscv/kprobe: Search free register(s) to clobber for 'AUIPC/JALR'

From: Liao Chang <[email protected]>

This patch implements the algorithm for searching free register(s) to
form a long-jump instruction pair.

An AUIPC/JALR instruction pair is introduced with a much wider jump
range (4GB), where auipc loads the upper 20 bits into a free register
and jalr adds the lower 12 bits to form a 32-bit immediate. Since
kprobes can be placed anywhere in kernel space, the free register should
be found in a generic way, not depending on the calling convention or
any other regulations.

The algorithm for finding the free register is inspired by register
renaming in modern processors. From the perspective of register renaming,
a register could be represented as two different registers if two
neighbouring instructions both write to it and nothing reads it in
between. Extending this fact, a register is considered free if there is
no read before its next write in the execution flow. We are free to
change its value without interfering with normal execution.

In order to do jump optimization, two free registers need to be found:
the first one is used to form the AUIPC/JALR jumping to the detour
buffer, and the second one is used to form the JR jumping back from the
detour buffer. If the first one is never updated by any of the
instructions replaced by 'AUIPC/JALR', both registers can be the same
one.

Let's use the example below to explain how the algorithm works. Assume
the kernel is an RVI and RVC hybrid binary, and one kprobe is placed at
the entry of function idle_dummy.

       Before              Optimized             Detour buffer
<idle_dummy>:                                    ...
 #1 add  sp,sp,-16         auipc a0, #?          add  sp,sp,-16
 #2 sd   s0,8(sp)                                sd   s0,8(sp)
 #3 addi s0,sp,16          jalr  a0, #?(a0)      addi s0,sp,16
 #4 ld   s0,8(sp)                                ld   s0,8(sp)
 #5 li   a0,0              li    a0,0            auipc a0, #?
 #6 addi sp,sp,16          addi  sp,sp,16        jr   x0, #?(a0)
 #7 ret                    ret

For a regular kprobe, it is trivial to replace the first instruction
with C.EBREAK; no other instruction or register is clobbered. To
optimize the kprobe with a long jump, the first 8 bytes need to be
patched with AUIPC/JALR, and a0 is chosen to hold the jump target,
because from #1 to #7, a0 is the only register that satisfies two
conditions: (1) no read before write, and (2) never updated in the
detour buffer. s0, in contrast, is used as a source register at #2, so
it is not free to clobber.

The search starts from the kprobe and stops at the last instruction of
the function or at the first branch/jump instruction. It decodes the
'rs' and 'rd' fields of each visited instruction. If the 'rd' has never
been read before, it is recorded in the bitmask 'write'; if the 'rs' has
never been written before, it is recorded in another bitmask 'read'.
When the search stops, the bits set in 'write' are the free registers
that can be used to form AUIPC/JALR or JR.
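The first-use bookkeeping can be sketched in standalone C as below. This
is a toy model and not the code in this patch: struct toy_insn,
note_read(), note_write() and scan_free_registers() are hypothetical
names, and instructions are given as pre-decoded register numbers
instead of being decoded from kernel text. The table encodes the
idle_dummy example from #1 to #7.

```c
#include <assert.h>
#include <stddef.h>

#define NO_REG (-1)

struct toy_insn {
	int rd, rs1, rs2;       /* register numbers, or NO_REG */
	int is_jump;            /* control transfer: stop after it */
};

/* Record a source register unless it was written first. */
static void note_read(int reg, unsigned long *write, unsigned long *read)
{
	if (reg == NO_REG)
		return;
	assert(reg >= 0 && reg < 32);
	if (!(*write & (1UL << reg)))
		*read |= 1UL << reg;
}

/* Record a destination register unless it was read first. */
static void note_write(int reg, unsigned long *write, unsigned long *read)
{
	if (reg == NO_REG)
		return;
	assert(reg >= 0 && reg < 32);
	if (!(*read & (1UL << reg)))
		*write |= 1UL << reg;
}

/* Returns the mask of free registers, with x0 always excluded. */
static unsigned long scan_free_registers(const struct toy_insn *insns,
					 size_t nr, unsigned long *write,
					 unsigned long *read)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		/* Sources are read before the destination is written. */
		note_read(insns[i].rs1, write, read);
		note_read(insns[i].rs2, write, read);
		note_write(insns[i].rd, write, read);
		if (insns[i].is_jump)
			break;
	}
	return *write & ~1UL;
}

/* idle_dummy: ra = x1, sp = x2, s0 = x8, a0 = x10 */
static const struct toy_insn idle_dummy[] = {
	{ 2,      2,      NO_REG, 0 },  /* #1 add  sp,sp,-16 */
	{ NO_REG, 2,      8,      0 },  /* #2 sd   s0,8(sp)  */
	{ 8,      2,      NO_REG, 0 },  /* #3 addi s0,sp,16  */
	{ 8,      2,      NO_REG, 0 },  /* #4 ld   s0,8(sp)  */
	{ 10,     NO_REG, NO_REG, 0 },  /* #5 li   a0,0      */
	{ 2,      2,      NO_REG, 0 },  /* #6 addi sp,sp,16  */
	{ NO_REG, 1,      NO_REG, 1 },  /* #7 ret            */
};
```

Scanning idle_dummy with both masks initially zero leaves only bit 10
(a0) set in 'write', matching the choice of a0 in the example, while
ra, sp and s0 end up in 'read'.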

Signed-off-by: Liao Chang <[email protected]>
Co-developed-by: Chen Guokai <[email protected]>
Signed-off-by: Chen Guokai <[email protected]>
---
arch/riscv/kernel/probes/opt.c | 225 ++++++++++++++++++++++++++++++++-
1 file changed, 224 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index e4a619c2077e..6d23c843832e 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -12,6 +12,9 @@
#include <asm/kprobes.h>
#include <asm/patch.h>

+#include "simulate-insn.h"
+#include "decode-insn.h"
+
static inline int in_auipc_jalr_range(long val)
{
#ifdef CONFIG_ARCH_RV32I
@@ -37,15 +40,235 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot,
{
}

+/* Registers the first usage of which is the destination of instruction */
+#define WRITE_ON(reg) \
+ (*write |= (((*read >> (reg)) ^ 1UL) & 1) << (reg))
+/* Registers the first usage of which is the source of instruction */
+#define READ_ON(reg) \
+ (*read |= (((*write >> (reg)) ^ 1UL) & 1) << (reg))
+
/*
* In RISC-V ISA, AUIPC/JALR clobber one register to form target address,
* by inspired by register renaming in OoO processor, this involves search
* backwards that is not previously used as a source register and is used
* as a destination register before any branch or jump instruction.
*/
+static void arch_find_register(unsigned long start, unsigned long end,
+ unsigned long *write, unsigned long *read)
+{
+ kprobe_opcode_t insn;
+ unsigned long addr, offset = 0UL;
+
+ for (addr = start; addr < end; addr += offset) {
+ insn = *(kprobe_opcode_t *)addr;
+ offset = GET_INSN_LENGTH(insn);
+
+#ifdef CONFIG_RISCV_ISA_C
+ if (offset == RVI_INSN_LEN)
+ goto is_rvi;
+
+ insn &= __COMPRESSED_INSN_MASK;
+ /* Stop searching until any control transfer instruction */
+ if (riscv_insn_is_c_ebreak(insn) || riscv_insn_is_c_j(insn))
+ break;
+
+ if (riscv_insn_is_c_jal(insn)) {
+ /* The rd of C.JAL is x1 by default */
+ WRITE_ON(1);
+ break;
+ }
+
+ if (riscv_insn_is_c_jr(insn)) {
+ READ_ON(rvc_r_rs1(insn));
+ break;
+ }
+
+ if (riscv_insn_is_c_jalr(insn)) {
+ READ_ON(rvc_r_rs1(insn));
+ /* The rd of C.JALR is x1 by default */
+ WRITE_ON(1);
+ break;
+ }
+
+ if (riscv_insn_is_c_beqz(insn) || riscv_insn_is_c_bnez(insn)) {
+ READ_ON(rvc_b_rs(insn));
+ break;
+ }
+
+ /*
+ * Decode RVC instructions that encode integer registers, and try
+ * to find destination registers that are never used as a source
+ * register before being written.
+ */
+ if (riscv_insn_is_c_sub(insn) || riscv_insn_is_c_subw(insn)) {
+ READ_ON(rvc_a_rs1(insn));
+ READ_ON(rvc_a_rs2(insn));
+ continue;
+ } else if (riscv_insn_is_c_sq(insn) ||
+ riscv_insn_is_c_sw(insn) ||
+ riscv_insn_is_c_sd(insn)) {
+ READ_ON(rvc_s_rs1(insn));
+ READ_ON(rvc_s_rs2(insn));
+ continue;
+ } else if (riscv_insn_is_c_addi16sp(insn) ||
+ riscv_insn_is_c_addi(insn) ||
+ riscv_insn_is_c_addiw(insn) ||
+ riscv_insn_is_c_slli(insn)) {
+ READ_ON(rvc_i_rs1(insn));
+ continue;
+ } else if (riscv_insn_is_c_sri(insn) ||
+ riscv_insn_is_c_andi(insn)) {
+ READ_ON(rvc_b_rs(insn));
+ continue;
+ } else if (riscv_insn_is_c_sqsp(insn) ||
+ riscv_insn_is_c_swsp(insn) ||
+ riscv_insn_is_c_sdsp(insn)) {
+ READ_ON(rvc_ss_rs2(insn));
+ /* The base register of C.SQSP/C.SWSP/C.SDSP is x2 by default */
+ READ_ON(2);
+ continue;
+ } else if (riscv_insn_is_c_mv(insn)) {
+ READ_ON(rvc_r_rs2(insn));
+ WRITE_ON(rvc_r_rd(insn));
+ } else if (riscv_insn_is_c_addi4spn(insn)) {
+ /* The rs of C.ADDI4SPN is x2 by default */
+ READ_ON(2);
+ WRITE_ON(rvc_l_rd(insn));
+ } else if (riscv_insn_is_c_lq(insn) ||
+ riscv_insn_is_c_lw(insn) ||
+ riscv_insn_is_c_ld(insn)) {
+ /* FIXME: c.lw/c.ld share opcode with c.flw/c.fld */
+ READ_ON(rvc_l_rs(insn));
+ WRITE_ON(rvc_l_rd(insn));
+ } else if (riscv_insn_is_c_lqsp(insn) ||
+ riscv_insn_is_c_lwsp(insn) ||
+ riscv_insn_is_c_ldsp(insn)) {
+ /*
+ * FIXME: c.lwsp/c.ldsp share opcode with c.flwsp/c.fldsp
+ * The rs of C.LQSP/C.LWSP/C.LDSP is x2 by default.
+ */
+ READ_ON(2);
+ WRITE_ON(rvc_i_rd(insn));
+ } else if (riscv_insn_is_c_li(insn) ||
+ riscv_insn_is_c_lui(insn)) {
+ WRITE_ON(rvc_i_rd(insn));
+ }
+
+ if ((*write > 1UL) && __builtin_ctzl(*write & ~1UL))
+ return;
+is_rvi:
+#endif
+ /* Stop searching until any control transfer instruction */
+ if (riscv_insn_is_branch(insn)) {
+ READ_ON(rvi_rs1(insn));
+ READ_ON(rvi_rs2(insn));
+ break;
+ }
+
+ if (riscv_insn_is_jal(insn)) {
+ WRITE_ON(rvi_rd(insn));
+ break;
+ }
+
+ if (riscv_insn_is_jalr(insn)) {
+ READ_ON(rvi_rs1(insn));
+ WRITE_ON(rvi_rd(insn));
+ break;
+ }
+
+ if (riscv_insn_is_system(insn)) {
+ /* csrrw, csrrs, csrrc */
+ if (rvi_rs1(insn))
+ READ_ON(rvi_rs1(insn));
+ /* csrrwi, csrrsi, csrrci, csrrw, csrrs, csrrc */
+ if (rvi_rd(insn))
+ WRITE_ON(rvi_rd(insn));
+ break;
+ }
+
+ /*
+ * Decode RVI instructions that have rd and rs, and try to find
+ * destination registers (rd) that are never used as a source
+ * register (rs) before being written.
+ */
+ if (riscv_insn_is_lui(insn) || riscv_insn_is_auipc(insn)) {
+ WRITE_ON(rvi_rd(insn));
+ } else if (riscv_insn_is_arith_ri(insn) ||
+ riscv_insn_is_load(insn)) {
+ READ_ON(rvi_rs1(insn));
+ WRITE_ON(rvi_rd(insn));
+ } else if (riscv_insn_is_arith_rr(insn) ||
+ riscv_insn_is_store(insn) ||
+ riscv_insn_is_amo(insn)) {
+ READ_ON(rvi_rs1(insn));
+ READ_ON(rvi_rs2(insn));
+ WRITE_ON(rvi_rd(insn));
+ }
+
+ if ((*write > 1UL) && __builtin_ctzl(*write & ~1UL))
+ return;
+ }
+}
+
static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
- int *rd1, int *rd2)
+ int *rd, int *ra)
{
+ unsigned long start, end;
+ /*
+ * Searching algorithm explanation:
+ *
+ * 1. Define two types of instruction area firstly:
+ *
+ * +-----+
+ * + +
+ * + + ---> instructions modified by optprobe, named 'O-Area'.
+ * + +
+ * +-----+
+ * + +
+ * + + ---> instructions after optprobe, named 'K-Area'.
+ * + +
+ * + ~ +
+ *
+ * 2. There are two usages for each GPR in given instruction area.
+ *
+ * - W: GPR is used as the RD operand at its first occurrence.
+ * - R: GPR is used as the RS operand at its first occurrence.
+ *
+ * Then there are 4 different usages for each GPR totally:
+ *
+ * 1. Used as W in O-Area, Used as W in K-Area.
+ * 2. Used as W in O-Area, Used as R in K-Area.
+ * 3. Used as R in O-Area, Used as W in K-Area.
+ * 4. Used as R in O-Area, Used as R in K-Area.
+ *
+ * All registers satisfy #1 or #3 could be chosen to form 'AUIPC/JALR'
+ * jumping to detour buffer.
+ *
+ * All registers satisfy #1 or #2, could be chosen to form 'JR' jumping
+ * back from detour buffer.
+ */
+ unsigned long kw = 0UL, kr = 0UL, ow = 0UL, or = 0UL;
+
+ /* Search one free register used to form AUIPC/JALR */
+ start = (unsigned long)&kp->opcode;
+ end = start + GET_INSN_LENGTH(kp->opcode);
+ arch_find_register(start, end, &ow, &or);
+
+ start = (unsigned long)kp->addr + GET_INSN_LENGTH(kp->opcode);
+ end = (unsigned long)kp->addr + op->optinsn.length;
+ arch_find_register(start, end, &ow, &or);
+
+ /* Search one free register used to form JR */
+ arch_find_register(end, (unsigned long)_end, &kw, &kr);
+
+ if ((kw & ow) > 1UL) {
+ *rd = __builtin_ctzl((kw & ow) & ~1UL);
+ *ra = *rd;
+ return;
+ }
+
+ *rd = ((kw | ow) == 1UL) ? 0 : __builtin_ctzl((kw | ow) & ~1UL);
+ *ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);
}

/*
--
2.25.1


2022-11-06 10:28:11

by Xim

Subject: [PATCH v4 7/8] riscv/kprobe: Prepare detour buffer for optimized kprobe

From: Liao Chang <[email protected]>

This patch introduces code to prepare the instruction slot for an
optimized kprobe. The instruction slot for a regular kprobe just records
two instructions: the first is the original instruction replaced by
EBREAK, and the second is an EBREAK for single-stepping. The instruction
slot for an optimized kprobe is larger: besides executing instructions
out-of-line, it also contains a standalone stack frame for calling the
kprobe handler.

All optimized instruction slots consist of 5 major parts, which are
copied from the assembly code template in opt_trampoline.S.

SAVE REGS
CALL optimized_callback
RESTORE REGS
EXECUTE INSNS OUT-OF-LINE
RETURN BACK

Although most instructions in each slot are the same, the slots still
differ slightly in their payload, in three parts:

- 'CALL optimized_callback': the relative offset of the 'call'
instruction is different for each kprobe.
- 'EXECUTE INSNS OUT-OF-LINE': the copied instructions are different
for each kprobe.
- 'RETURN BACK': the chosen free register is reused here as the
register for jumping back.

So the slot payload needs to be customized for each optimized kprobe.
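How the per-kprobe instruction words can be produced is sketched below
in standalone C. The encoders enc_auipc() and enc_jalr() are
hypothetical stand-ins written from the RISC-V U-type and I-type
formats. The patch itself uses rv_auipc() and rv_jalr() from the BPF
JIT header, which this sketch assumes emit the same encodings.

```c
#include <assert.h>
#include <stdint.h>

/* U-type: imm[31:12] | rd | opcode (AUIPC = 0x17) */
static uint32_t enc_auipc(unsigned int rd, uint32_t imm20)
{
	assert(rd < 32);
	return ((imm20 & 0xfffffu) << 12) | (rd << 7) | 0x17u;
}

/* I-type: imm[11:0] | rs1 | funct3 = 0 | rd | opcode (JALR = 0x67) */
static uint32_t enc_jalr(unsigned int rd, unsigned int rs1, int32_t imm12)
{
	assert(rd < 32 && rs1 < 32);
	return (((uint32_t)imm12 & 0xfffu) << 20) | (rs1 << 15) |
	       (rd << 7) | 0x67u;
}

/* 'RETURN BACK' tail for a chosen free register: jalr x0, 0(rd) */
static uint32_t enc_return_jump(unsigned int rd)
{
	return enc_jalr(0, rd, 0);
}
```

With ra (x1) as the register, enc_auipc(1, 0) and enc_jalr(1, 1, 0)
give the "auipc ra, 0" and "jalr ra, 0(ra)" placeholders that Step2
rewrites, and enc_return_jump(10) gives "jr a0" for a kprobe whose free
register is a0.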

Signed-off-by: Liao Chang <[email protected]>
Co-developed-by: Chen Guokai <[email protected]>
Signed-off-by: Chen Guokai <[email protected]>
---
arch/riscv/include/asm/kprobes.h | 16 +++
arch/riscv/kernel/probes/opt.c | 75 +++++++++++++
arch/riscv/kernel/probes/opt_trampoline.S | 125 ++++++++++++++++++++++
3 files changed, 216 insertions(+)

diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index 22b73a2fd1fd..a9ef864f7225 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -48,10 +48,26 @@ void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
/* optinsn template addresses */
extern __visible kprobe_opcode_t optprobe_template_entry[];
extern __visible kprobe_opcode_t optprobe_template_end[];
+extern __visible kprobe_opcode_t optprobe_template_save[];
+extern __visible kprobe_opcode_t optprobe_template_call[];
+extern __visible kprobe_opcode_t optprobe_template_insn[];
+extern __visible kprobe_opcode_t optprobe_template_return[];

#define MAX_OPTINSN_SIZE \
((unsigned long)optprobe_template_end - \
(unsigned long)optprobe_template_entry)
+#define DETOUR_SAVE_OFFSET \
+ ((unsigned long)optprobe_template_save - \
+ (unsigned long)optprobe_template_entry)
+#define DETOUR_CALL_OFFSET \
+ ((unsigned long)optprobe_template_call - \
+ (unsigned long)optprobe_template_entry)
+#define DETOUR_INSN_OFFSET \
+ ((unsigned long)optprobe_template_insn - \
+ (unsigned long)optprobe_template_entry)
+#define DETOUR_RETURN_OFFSET \
+ ((unsigned long)optprobe_template_return - \
+ (unsigned long)optprobe_template_entry)

/*
* For RVI and RVC hybrid encoding kernel, although long jump just needs
diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
index 876bec539554..77248ed7d4e8 100644
--- a/arch/riscv/kernel/probes/opt.c
+++ b/arch/riscv/kernel/probes/opt.c
@@ -11,9 +11,37 @@
#include <linux/kprobes.h>
#include <asm/kprobes.h>
#include <asm/patch.h>
+#include <asm/asm-offsets.h>

#include "simulate-insn.h"
#include "decode-insn.h"
+#include "../../net/bpf_jit.h"
+
+static void
+optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
+{
+ unsigned long flags;
+ struct kprobe_ctlblk *kcb;
+
+ /* Save skipped registers */
+ regs->epc = (unsigned long)op->kp.addr;
+ regs->orig_a0 = ~0UL;
+
+ local_irq_save(flags);
+ kcb = get_kprobe_ctlblk();
+
+ if (kprobe_running()) {
+ kprobes_inc_nmissed_count(&op->kp);
+ } else {
+ __this_cpu_write(current_kprobe, &op->kp);
+ kcb->kprobe_status = KPROBE_HIT_ACTIVE;
+ opt_pre_handler(&op->kp, regs);
+ __this_cpu_write(current_kprobe, NULL);
+ }
+ local_irq_restore(flags);
+}
+
+NOKPROBE_SYMBOL(optimized_callback)

static inline int in_auipc_jalr_range(long val)
{
@@ -30,6 +58,11 @@ static inline int in_auipc_jalr_range(long val)
#endif
}

+#define DETOUR_ADDR(code, offs) \
+ ((void *)((unsigned long)(code) + (offs)))
+#define DETOUR_INSN(code, offs) \
+ (*(kprobe_opcode_t *)((unsigned long)(code) + (offs)))
+
/*
* Copy optprobe assembly code template into detour buffer and modify some
* instructions for each kprobe.
@@ -38,6 +71,49 @@ static void prepare_detour_buffer(kprobe_opcode_t *code, kprobe_opcode_t *slot,
int rd, struct optimized_kprobe *op,
kprobe_opcode_t opcode)
{
+ long offs;
+ unsigned long data;
+
+ memcpy(code, optprobe_template_entry, MAX_OPTINSN_SIZE);
+
+ /* Step1: record optimized_kprobe pointer into detour buffer */
+ memcpy(DETOUR_ADDR(code, DETOUR_SAVE_OFFSET), &op, sizeof(op));
+
+ /*
+ * Step2
+ * auipc ra, 0 --> auipc ra, HI20.{optimized_callback - pc}
+ * jalr ra, 0(ra) --> jalr ra, LO12.{optimized_callback - pc}(ra)
+ */
+ offs = (unsigned long)&optimized_callback -
+ (unsigned long)DETOUR_ADDR(slot, DETOUR_CALL_OFFSET);
+ DETOUR_INSN(code, DETOUR_CALL_OFFSET) =
+ rv_auipc(1, (offs + (1 << 11)) >> 12);
+ DETOUR_INSN(code, DETOUR_CALL_OFFSET + 0x4) =
+ rv_jalr(1, 1, offs & 0xFFF);
+
+ /* Step3: copy replaced instructions into detour buffer */
+ memcpy(DETOUR_ADDR(code, DETOUR_INSN_OFFSET), op->kp.addr,
+ op->optinsn.length);
+ memcpy(DETOUR_ADDR(code, DETOUR_INSN_OFFSET), &opcode,
+ GET_INSN_LENGTH(opcode));
+
+ /* Step4: record return address of long jump into detour buffer */
+ data = (unsigned long)op->kp.addr + op->optinsn.length;
+ memcpy(DETOUR_ADDR(code, DETOUR_RETURN_OFFSET), &data, sizeof(data));
+
+ /*
+ * Step5
+ * auipc ra, 0 --> auipc rd, 0
+ * ld/w ra, -8(ra) --> ld/w rd, -8(rd)
+ * jalr x0, 0(ra) --> jalr x0, 0(rd)
+ */
+ DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0x8) = rv_auipc(rd, 0);
+#if __riscv_xlen == 32
+ DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0xC) = rv_lw(rd, -8, rd);
+#else
+ DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0xC) = rv_ld(rd, -8, rd);
+#endif
+ DETOUR_INSN(code, DETOUR_RETURN_OFFSET + 0x10) = rv_jalr(0, rd, 0);
}

/* Registers the first usage of which is the destination of instruction */
diff --git a/arch/riscv/kernel/probes/opt_trampoline.S b/arch/riscv/kernel/probes/opt_trampoline.S
index 16160c4367ff..75e34e373cf2 100644
--- a/arch/riscv/kernel/probes/opt_trampoline.S
+++ b/arch/riscv/kernel/probes/opt_trampoline.S
@@ -1,12 +1,137 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2022 Guokai Chen
+ * Copyright (C) 2022 Liao, Chang <[email protected]>
*/

#include <linux/linkage.h>

+#include <asm/asm.h>
#include <asm/csr.h>
#include <asm/asm-offsets.h>

SYM_ENTRY(optprobe_template_entry, SYM_L_GLOBAL, SYM_A_NONE)
+ addi sp, sp, -(PT_SIZE_ON_STACK)
+ REG_S x1, PT_RA(sp)
+ REG_S x2, PT_SP(sp)
+ REG_S x3, PT_GP(sp)
+ REG_S x4, PT_TP(sp)
+ REG_S x5, PT_T0(sp)
+ REG_S x6, PT_T1(sp)
+ REG_S x7, PT_T2(sp)
+ REG_S x8, PT_S0(sp)
+ REG_S x9, PT_S1(sp)
+ REG_S x10, PT_A0(sp)
+ REG_S x11, PT_A1(sp)
+ REG_S x12, PT_A2(sp)
+ REG_S x13, PT_A3(sp)
+ REG_S x14, PT_A4(sp)
+ REG_S x15, PT_A5(sp)
+ REG_S x16, PT_A6(sp)
+ REG_S x17, PT_A7(sp)
+ REG_S x18, PT_S2(sp)
+ REG_S x19, PT_S3(sp)
+ REG_S x20, PT_S4(sp)
+ REG_S x21, PT_S5(sp)
+ REG_S x22, PT_S6(sp)
+ REG_S x23, PT_S7(sp)
+ REG_S x24, PT_S8(sp)
+ REG_S x25, PT_S9(sp)
+ REG_S x26, PT_S10(sp)
+ REG_S x27, PT_S11(sp)
+ REG_S x28, PT_T3(sp)
+ REG_S x29, PT_T4(sp)
+ REG_S x30, PT_T5(sp)
+ REG_S x31, PT_T6(sp)
+ /* Update fp so that stack traces through the detour stay readable */
+ addi s0, sp, (PT_SIZE_ON_STACK)
+ j 1f
+
+SYM_ENTRY(optprobe_template_save, SYM_L_GLOBAL, SYM_A_NONE)
+ /*
+ * Step1:
+ * Filled with the pointer to optimized_kprobe data
+ */
+ .dword 0
+1:
+ /* Load optimized_kprobe pointer from .dword above */
+ auipc a0, 0
+ REG_L a0, -8(a0)
+ add a1, sp, x0
+
+SYM_ENTRY(optprobe_template_call, SYM_L_GLOBAL, SYM_A_NONE)
+ /*
+ * Step2:
+ * The immediates of AUIPC/JALR are patched with the offset to
+ * optimized_callback; its first argument was loaded from the .dword above.
+ */
+ auipc ra, 0
+ jalr ra, 0(ra)
+
+ REG_L x1, PT_RA(sp)
+ REG_L x3, PT_GP(sp)
+ REG_L x4, PT_TP(sp)
+ REG_L x5, PT_T0(sp)
+ REG_L x6, PT_T1(sp)
+ REG_L x7, PT_T2(sp)
+ REG_L x8, PT_S0(sp)
+ REG_L x9, PT_S1(sp)
+ REG_L x10, PT_A0(sp)
+ REG_L x11, PT_A1(sp)
+ REG_L x12, PT_A2(sp)
+ REG_L x13, PT_A3(sp)
+ REG_L x14, PT_A4(sp)
+ REG_L x15, PT_A5(sp)
+ REG_L x16, PT_A6(sp)
+ REG_L x17, PT_A7(sp)
+ REG_L x18, PT_S2(sp)
+ REG_L x19, PT_S3(sp)
+ REG_L x20, PT_S4(sp)
+ REG_L x21, PT_S5(sp)
+ REG_L x22, PT_S6(sp)
+ REG_L x23, PT_S7(sp)
+ REG_L x24, PT_S8(sp)
+ REG_L x25, PT_S9(sp)
+ REG_L x26, PT_S10(sp)
+ REG_L x27, PT_S11(sp)
+ REG_L x28, PT_T3(sp)
+ REG_L x29, PT_T4(sp)
+ REG_L x30, PT_T5(sp)
+ REG_L x31, PT_T6(sp)
+ REG_L x2, PT_SP(sp)
+ addi sp, sp, (PT_SIZE_ON_STACK)
+
+SYM_ENTRY(optprobe_template_insn, SYM_L_GLOBAL, SYM_A_NONE)
+ /*
+ * Step3:
+ * The NOPs will be replaced by the probed instructions; in the worst case
+ * 3 RVC and 1 RVI instructions are executed out of line.
+ */
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ nop
+ j 2f
+
+SYM_ENTRY(optprobe_template_return, SYM_L_GLOBAL, SYM_A_NONE)
+ /*
+ * Step4:
+ * Filled with the return address of long jump(AUIPC/JALR)
+ */
+ .dword 0
+2:
+ /*
+ * Step5:
+ * The <RA> of AUIPC/LD/JALR will be replaced for each kprobe,
+ * used to read return address saved in .dword above.
+ */
+ auipc ra, 0
+ REG_L ra, -8(ra)
+ jalr x0, 0(ra)
SYM_ENTRY(optprobe_template_end, SYM_L_GLOBAL, SYM_A_NONE)
--
2.25.1
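
Step2 of prepare_detour_buffer() above splits a pc-relative offset into a 20-bit part for AUIPC and a 12-bit part for JALR. Because JALR sign-extends its 12-bit immediate, the upper part is rounded by adding 1 << 11 before shifting. The following is a minimal userspace sketch of that arithmetic, not the kernel code itself; the helper names hi20(), lo12() and rebuild() are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/*
 * Upper 20 bits for AUIPC. Adding 0x800 pre-compensates for the sign
 * extension that JALR applies to its 12-bit immediate.
 * (hi20/lo12/rebuild are illustrative names, not kernel helpers.)
 */
static uint32_t hi20(int32_t offs)
{
	return (((uint32_t)offs + 0x800u) >> 12) & 0xFFFFFu;
}

/* Lower 12 bits for JALR, sign-extended the way the hardware does it. */
static int32_t lo12(int32_t offs)
{
	int32_t lo = (int32_t)((uint32_t)offs & 0xFFFu);

	return lo >= 0x800 ? lo - 0x1000 : lo;
}

/* What the AUIPC+JALR pair adds to pc: (hi << 12) + sext(lo). */
static int32_t rebuild(uint32_t hi, int32_t lo)
{
	return (int32_t)((hi << 12) + (uint32_t)lo);
}
```

For an offset of 0x800 the split is hi = 1, lo = -0x800, which recombines to 0x800; without the rounding term the pair would land 4KB short of the target.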


2022-11-07 17:08:38

by Björn Töpel

Subject: Re: [PATCH v4 6/8] riscv/kprobe: Add code to check if kprobe can be optimized

Chen Guokai <[email protected]> writes:

[...]

> diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
> index 6d23c843832e..876bec539554 100644
> --- a/arch/riscv/kernel/probes/opt.c
> +++ b/arch/riscv/kernel/probes/opt.c

[...]

> /*
> - * If two free registers can be found at the beginning of both
> - * the start and the end of replaced code, it can be optimized
> - * Also, in-function jumps need to be checked to make sure that
> - * there is no jump to the second instruction to be replaced
> + * The kprobe can be optimized when no in-function jump reaches to the
> + * instructions replaced by optimized jump instructions(AUIPC/JALR).
> */
> static bool can_optimize(unsigned long paddr, struct optimized_kprobe *op)
> {
> - return false;
> + int ret;
> + unsigned long addr, size = 0, offset = 0;
> + struct kprobe *kp = get_kprobe((kprobe_opcode_t *)paddr);
> +
> + /*
> + * Skip optimization if the kprobe has been disarmed or the instrumented
> + * instruction does not support XOI.
> + */
> + if (!kp || (riscv_probe_decode_insn(&kp->opcode, NULL) != INSN_GOOD))
> + return false;
> +
> + /*
> + * Find an instruction window large enough to contain a pair
> + * of AUIPC/JALR, and ensure each instruction in this window
> + * supports XOI.
> + */
> + ret = search_copied_insn(paddr, op);
> + if (ret)
> + return false;
> +
> + if (!kallsyms_lookup_size_offset(paddr, &size, &offset))
> + return false;
> +
> + /* Check there is enough space for relative jump(AUIPC/JALR) */
> + if (size - offset <= op->optinsn.length)
> + return false;
> +
> + /*
> + * Decode instructions until the function end, checking that no
> + * instruction jumps into the window used to emit optprobe(AUIPC/JALR).
> + */

Don't the fixup tables need to be checked, similar to the x86 code?


Björn

2022-11-07 17:11:11

by Björn Töpel

Subject: Re: [PATCH v4 0/8] Add OPTPROBES feature on RISCV

Chen Guokai <[email protected]> writes:

> From: Liao Chang <[email protected]>
>
> From: Liao Chang <[email protected]>
>
> Add jump optimization support for RISC-V.

Thanks for working on this! I have some comments on the series, but I'll
do that on a per-patch basis.

Have you run the series on real hardware, or just qemu?


Cheers,
Björn

2022-11-07 17:19:46

by Björn Töpel

Subject: Re: [PATCH v4 5/8] riscv/kprobe: Search free register(s) to clobber for 'AUIPC/JALR'

Chen Guokai <[email protected]> writes:

> From: Liao Chang <[email protected]>
>
> From: Liao Chang <[email protected]>
>
> This patch implements the algorithm for searching for free register(s) to
> form a long-jump instruction pair.
>
> The AUIPC/JALR instruction pair is introduced with a much wider jump range
> (4GB), where auipc loads the upper 20 bits to a free register and jalr
> appends the lower 12 bits to form a 32 bit immediate. Since kprobes can
> be instrumented anywhere in kernel space, the free register
> should be found in a generic way, not depending on the calling convention
> or any other regulations.
>
> The algorithm for finding the free register is inspired by the register
> renaming in modern processors. From the perspective of register renaming,
> a register could be represented as two different registers if two neighbour
> instructions both write to it but no one ever reads. Extending this fact,
> a register is considered to be free if there is no read before its next
> write in the execution flow. We are free to change its value without
> interfering with normal execution.
>
> Jump optimization needs two free registers: the first one is used to
> form the AUIPC/JALR that jumps to the detour buffer, the second one is
> used to form the JR that jumps back from the detour buffer. If the first
> one is never updated by any instruction replaced by 'AUIPC/JALR',
> both can be the same register.
>
> Let's use the example below to explain how the algorithm works. Given a
> kernel that is a hybrid RVI and RVC binary, one kprobe is instrumented at
> the entry of function idle_dummy.
>
> Before Optimized Detour buffer
> <idle_dummy>: ...
> #1 add sp,sp,-16 auipc a0, #? add sp,sp,-16
> #2 sd s0,8(sp) sd s0,8(sp)
> #3 addi s0,sp,16 jalr a0, #?(a0) addi s0,sp,16
> #4 ld s0,8(sp) ld s0,8(sp)
> #5 li a0,0 li a0,0 auipc a0, #?
> #6 addi sp,sp,16 addi sp,sp,16 jr x0, #?(a0)
> #7 ret ret
>
> For a regular kprobe, it is trivial to replace the first instruction with
> C.EREABK, no more instruction and register will be clobber, in order to

"C.EBREAK"

> optimize the kprobe with a long jump, the first 8 bytes are patched with
> AUIPC/JALR, and a0 is chosen to hold the jump target address,
> because from #1 to #7, a0 is the only register that satisfies two
> conditions: (1) No read before write (2) Never updated in the detour
> buffer. s0, in contrast, is used as a source register at #2, so it is
> not free to clobber.
>
> The search starts from the kprobe and stops at the last instruction of
> the function or the first branch/jump instruction; it decodes the 'rs'
> and 'rd' fields of each visited instruction. If the 'rd' has never been
> read before, it is recorded in the bitmask 'write'; if the 'rs' has never
> been written before, it is recorded in another bitmask 'read'. When the
> search stops, the remaining bits of 'write' are the free registers to form
> AUIPC/JALR or JR.
>

AFAIU, the algorithm only tracks registers that are *in use*. You are
already scanning the whole function (next patch). What about the caller
saved registers that are *not* used by the function in the probe range?
Can those, potentially unused, regs be used?
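
The read/write bitmask scan described in the quoted commit message can be sketched in userspace C as follows. This is only a model under stated assumptions: instructions are given in a pre-decoded form (the struct insn layout, NO_REG and find_free_regs() are hypothetical names, not taken from the patch), and only condition (1), "no read before write", is modelled. The separate check that a register is not updated in the detour buffer is left out.

```c
#include <stdint.h>

#define NO_REG (-1)	/* hypothetical marker: operand not present */

/* Hypothetical pre-decoded instruction: register numbers x0..x31. */
struct insn {
	int rd;
	int rs1;
	int rs2;
};

/* The idle_dummy example from the commit message, #1..#6 (sp=x2, s0=x8, a0=x10). */
static const struct insn idle_dummy[] = {
	{  2,      2, NO_REG },	/* #1 add  sp,sp,-16 */
	{ NO_REG,  2,      8 },	/* #2 sd   s0,8(sp)  */
	{  8,      2, NO_REG },	/* #3 addi s0,sp,16  */
	{  8,      2, NO_REG },	/* #4 ld   s0,8(sp)  */
	{ 10, NO_REG, NO_REG },	/* #5 li   a0,0      */
	{  2,      2, NO_REG },	/* #6 addi sp,sp,16  */
};

/* Return a bitmask of registers that are written before they are ever read. */
static uint32_t find_free_regs(const struct insn *insns, int n)
{
	uint32_t read = 0, write = 0;

	for (int i = 0; i < n; i++) {
		const struct insn *in = &insns[i];

		/* A source not written earlier carries a live value. */
		if (in->rs1 != NO_REG && !(write & (1u << in->rs1)))
			read |= 1u << in->rs1;
		if (in->rs2 != NO_REG && !(write & (1u << in->rs2)))
			read |= 1u << in->rs2;
		/* A destination not read earlier holds a dead value. */
		if (in->rd != NO_REG && !(read & (1u << in->rd)))
			write |= 1u << in->rd;
	}
	/* x0 is hardwired to zero and can never hold an address. */
	return write & ~1u;
}
```

Running the scan over idle_dummy leaves only bit 10 set, matching the commit message's conclusion that a0 is the one register free to clobber, while sp and s0 are excluded because they are read before being written.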

> Signed-off-by: Liao Chang <[email protected]>
> Co-developed-by: Chen Guokai <[email protected]>
> Signed-off-by: Chen Guokai <[email protected]>
> ---
> arch/riscv/kernel/probes/opt.c | 225 ++++++++++++++++++++++++++++++++-
> 1 file changed, 224 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kernel/probes/opt.c b/arch/riscv/kernel/probes/opt.c
> index e4a619c2077e..6d23c843832e 100644
> --- a/arch/riscv/kernel/probes/opt.c
> +++ b/arch/riscv/kernel/probes/opt.c

[...]

> +static void arch_find_register(unsigned long start, unsigned long end,

Nit; When I see "arch_" I think it's functionality that can be
overridden per-arch. This is not the case, but just a helper for RV.

[...]

> static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
> - int *rd1, int *rd2)
> + int *rd, int *ra)

Nit; Please get rid of this code churn, just name the parameters
correctly on introduction in the previous patch.

[...]

> + *rd = ((kw | ow) == 1UL) ? 0 : __builtin_ctzl((kw | ow) & ~1UL);
> + *ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);

Hmm, __builtin_ctzl is undefined for 0, right? Can that be triggered
here?


Björn

2022-11-08 12:00:23

by Xim

Subject: Re: [PATCH v4 0/8] Add OPTPROBES feature on RISCV

Hi Björn,

Thanks for your great review! Some explanations below.

> On Nov 8, 2022, at 00:54, Björn Töpel <[email protected]> wrote:
>
> Have you run the series on real hardware, or just qemu?

So far only qemu tests have been run; I will try to test it on real FPGA hardware soon.

> AFAIU, the algorithm only tracks registers that are *in use*. You are
> already scanning the whole function (next patch). What about the caller
> saved registers that are *not* used by the function in the probe range?
> Can those, potentially unused, regs be used?

Good catch, that part was missing! I ran a static analysis right after receiving this mail.
The result shows that the newly proposed idea reaches about the same
success rate on my test set (rv64 defconfig with RVI only), and when the two are
combined they reach a higher success rate, 1/3 above their baseline. A patch that
includes this strategy will be sent soon.
>
>> +static void arch_find_register(unsigned long start, unsigned long end,
>
> Nit; When I see "arch_" I think it's functionality that can be
> overridden per-arch. This is not the case, but just a helper for RV.

There are two aspects to this. First, the helper could be extended to most RISC
archs and extracted into the common kprobe flow. Second, it is indeed
an internal helper for now, so I will correct the name in the next version.

>> static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
>> - int *rd1, int *rd2)
>> + int *rd, int *ra)
>
> Nit; Please get rid of this code churn, just name the parameters
> correctly on introduction in the previous patch.

Will be fixed.

>> + *rd = ((kw | ow) == 1UL) ? 0 : __builtin_ctzl((kw | ow) & ~1UL);
>> + *ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);
>
> Hmm, __builtin_ctzl is undefined for 0, right? Can that be triggered
> here?

Will be fixed.

Regards,
Guokai Chen

2022-11-08 12:09:51

by Liao, Chang

Subject: Re: [PATCH v4 0/8] Add OPTPROBES feature on RISCV



On 2022/11/8 19:04, Xim wrote:
> Hi Björn,
>
> Thanks for your great review! Some explanations below.
>
>> On Nov 8, 2022, at 00:54, Björn Töpel <[email protected]> wrote:
>>
>> Have you run the series on real hardware, or just qemu?
>
> So far only qemu tests have been run; I will try to test it on real FPGA hardware soon.
>
>> AFAIU, the algorithm only tracks registers that are *in use*. You are
>> already scanning the whole function (next patch). What about the caller
>> saved registers that are *not* used by the function in the probe range?
>> Can those, potentially unused, regs be used?
>
> Good catch, that part was missing! I ran a static analysis right after receiving this mail.
> The result shows that the newly proposed idea reaches about the same
> success rate on my test set (rv64 defconfig with RVI only), and when the two are
> combined they reach a higher success rate, 1/3 above their baseline. A patch that
> includes this strategy will be sent soon.
>>
>>> +static void arch_find_register(unsigned long start, unsigned long end,
>>
>> Nit; When I see "arch_" I think it's functionality that can be
>> overridden per-arch. This is not the case, but just a helper for RV.
>
> There are two aspects to this. First, the helper could be extended to most RISC
> archs and extracted into the common kprobe flow. Second, it is indeed
> an internal helper for now, so I will correct the name in the next version.
>
>>> static void find_free_registers(struct kprobe *kp, struct optimized_kprobe *op,
>>> - int *rd1, int *rd2)
>>> + int *rd, int *ra)
>>
>> Nit; Please get rid of this code churn, just name the parameters
>> correctly on introduction in the previous patch.
>
> Will be fixed.
>
>>> + *rd = ((kw | ow) == 1UL) ? 0 : __builtin_ctzl((kw | ow) & ~1UL);
>>> + *ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);
>>
>> Hmm, __builtin_ctzl is undefined for 0, right? Can that be triggered
>> here?

This corner case has been taken into account. Look at the condition parts:
if kw == 1UL the expression returns 0 directly, so there is no chance of invoking __builtin_ctzl.

Thanks.

>
> Will be fixed.
>
> Regards,
> Guokai Chen
>

--
BR,
Liao, Chang

2022-11-08 14:47:43

by Björn Töpel

Subject: Re: [PATCH v4 0/8] Add OPTPROBES feature on RISCV

"liaochang (A)" <[email protected]> writes:

>>>> + *rd = ((kw | ow) == 1UL) ? 0 : __builtin_ctzl((kw | ow) & ~1UL);
>>>> + *ra = (kw == 1UL) ? 0 : __builtin_ctzl(kw & ~1UL);
>>>
>>> Hmm, __builtin_ctzl is undefined for 0, right? Can that be triggered
>>> here?
>
> This corner case has been taken into account. Look at the condition parts:
> if kw == 1UL the expression returns 0 directly, so there is no chance of invoking __builtin_ctzl.

Indeed! Thanks for making that clear! Looking forward to the next
revision!


Björn
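
The guard discussed in this exchange relies on the candidate mask being exactly 1UL when no usable register remains, which holds as long as bit 0 (x0) is always set in the mask; whether that invariant is guaranteed depends on code not shown in this thread. A slightly more defensive formulation, sketched below as an assumption rather than as what the patch does, tests the masked value itself so that zero can never reach the builtin. The function name pick_reg() is hypothetical.

```c
/*
 * Return the lowest-numbered register in 'mask', ignoring x0 (bit 0).
 * Returns 0 when no candidate is left, so __builtin_ctzl() (a GCC/Clang
 * builtin whose result is undefined for 0) is never called with 0.
 * pick_reg is an illustrative name, not a function from the patch.
 */
static int pick_reg(unsigned long mask)
{
	mask &= ~1UL;
	return mask ? __builtin_ctzl(mask) : 0;
}
```

With this shape, pick_reg(1UL) and pick_reg(0UL) both yield 0, and a mask with bits 5 and 10 set yields 5.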

2022-11-13 06:02:24

by kernel test robot

Subject: Re: [PATCH v4 4/8] riscv/kprobe: Add common RVI and RVC instruction decoder code

Hi Chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.1-rc4 next-20221111]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Chen-Guokai/Add-OPTPROBES-feature-on-RISCV/20221106-180613
patch link: https://lore.kernel.org/r/20221106100316.2803176-5-chenguokai17%40mails.ucas.ac.cn
patch subject: [PATCH v4 4/8] riscv/kprobe: Add common RVI and RVC instruction decoder code
config: riscv-randconfig-p002-20221113
compiler: riscv32-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/0c2329bee63280ee1d9f257ed71b15e84f575344
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Chen-Guokai/Add-OPTPROBES-feature-on-RISCV/20221106-180613
git checkout 0c2329bee63280ee1d9f257ed71b15e84f575344
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash arch/riscv/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from arch/riscv/kernel/probes/uprobes.c:7:
>> arch/riscv/kernel/probes/decode-insn.h:19:27: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
19 | static inline u16 rvi_rs1(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:24:27: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
24 | static inline u16 rvi_rs2(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:29:26: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
29 | static inline u16 rvi_rd(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:34:35: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
34 | static inline s32 rvi_branch_imme(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:46:32: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
46 | static inline s32 rvi_jal_imme(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:59:29: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
59 | static inline u16 rvc_r_rs1(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:64:29: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
64 | static inline u16 rvc_r_rs2(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:69:28: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
69 | static inline u16 rvc_r_rd(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:74:29: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
74 | static inline u16 rvc_i_rs1(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:79:28: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
79 | static inline u16 rvc_i_rd(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:84:30: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
84 | static inline u16 rvc_ss_rs2(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:89:28: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
89 | static inline u16 rvc_l_rd(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:94:28: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
94 | static inline u16 rvc_l_rs(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:99:29: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
99 | static inline u16 rvc_s_rs2(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:104:29: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
104 | static inline u16 rvc_s_rs1(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:109:29: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
109 | static inline u16 rvc_a_rs2(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:114:29: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
114 | static inline u16 rvc_a_rs1(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:119:28: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
119 | static inline u16 rvc_a_rd(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:124:28: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
124 | static inline u16 rvc_b_rd(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:129:28: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
129 | static inline u16 rvc_b_rs(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:134:35: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
134 | static inline s32 rvc_branch_imme(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t
arch/riscv/kernel/probes/decode-insn.h:147:32: error: unknown type name 'kprobe_opcode_t'; did you mean 'uprobe_opcode_t'?
147 | static inline s32 rvc_jal_imme(kprobe_opcode_t opcode)
| ^~~~~~~~~~~~~~~
| uprobe_opcode_t


vim +19 arch/riscv/kernel/probes/decode-insn.h

18
> 19 static inline u16 rvi_rs1(kprobe_opcode_t opcode)
20 {
21 return (u16)((opcode >> 15) & 0x1f);
22 }
23
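
For reference, the helper flagged above extracts the rs1 field from bits 19:15 of a 32-bit RVI instruction; rd sits in bits 11:7 and rs2 in bits 24:20. The sketch below reproduces that extraction in userspace with a plain uint32_t in place of kprobe_opcode_t. Using a fixed-width type here is an assumption made for illustration and is not necessarily the fix the authors will choose for the build error; the demo_ prefix marks the functions as hypothetical.

```c
#include <stdint.h>

/* rs1 lives in bits 19:15 of an RVI instruction word. */
static inline uint16_t demo_rvi_rs1(uint32_t opcode)
{
	return (uint16_t)((opcode >> 15) & 0x1f);
}

/* rs2 lives in bits 24:20 (meaningful for R/S/B-type encodings). */
static inline uint16_t demo_rvi_rs2(uint32_t opcode)
{
	return (uint16_t)((opcode >> 20) & 0x1f);
}

/* rd lives in bits 11:7. */
static inline uint16_t demo_rvi_rd(uint32_t opcode)
{
	return (uint16_t)((opcode >> 7) & 0x1f);
}
```

As a check, 0x00c58533 encodes add a0,a1,a2 and decodes to rd = 10, rs1 = 11, rs2 = 12, while 0x01010413 encodes addi s0,sp,16 and decodes to rd = 8, rs1 = 2.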

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (8.43 kB)
config (156.18 kB)