2022-05-09 10:12:45

by Christophe Leroy

Subject: [PATCH v3 00/25] powerpc: ftrace optimisation and cleanup and more [v3]

This series provides optimisation and cleanup of ftrace on powerpc.

With this series ftrace activation is about 20% faster on an 8xx.

At the end of the series come additional cleanups around ppc-opcode,
that would likely conflict with this series if posted separately.

Change since v2:
- The only change in v3 is in patch 21, to fix sparse problems reported by the Robot.

Main changes since v1 (details after each individual patch description):
- Added 3 patches (8, 9, 10) that replace the PPC64_ELF_ABI_v{1/2} macros with CONFIG_PPC64_ELF_ABI_V{1/2}
- Addressed comments from Naveen

Christophe Leroy (25):
powerpc/ftrace: Refactor prepare_ftrace_return()
powerpc/ftrace: Remove redundant create_branch() calls
powerpc/code-patching: Inline is_offset_in_{cond}_branch_range()
powerpc/ftrace: Use is_offset_in_branch_range()
powerpc/code-patching: Inline create_branch()
powerpc/ftrace: Inline ftrace_modify_code()
powerpc/ftrace: Use patch_instruction() return directly
powerpc: Add CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2
powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2}
powerpc: Finalise cleanup around ABI use
powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and
PPC64
powerpc/ftrace: Don't include ftrace.o for CONFIG_FTRACE_SYSCALLS
powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of
CONFIG_DYNAMIC_FTRACE
powerpc/ftrace: Remove ftrace_plt_tramps[]
powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1
powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding.
powerpc/ftrace: Use size macro instead of opencoding
powerpc/ftrace: Simplify expected_nop_sequence()
powerpc/ftrace: Minimise number of #ifdefs
powerpc/inst: Add __copy_inst_from_kernel_nofault()
powerpc/ftrace: Don't use copy_from_kernel_nofault() in
module_trampoline_target()
powerpc/inst: Remove PPC_INST_BRANCH
powerpc/modules: Use PPC_LI macros instead of opencoding
powerpc/inst: Remove PPC_INST_BL
powerpc/opcodes: Remove unused PPC_INST_XXX macros

arch/powerpc/Kconfig | 2 +-
arch/powerpc/Makefile | 12 +-
arch/powerpc/boot/Makefile | 2 +
arch/powerpc/include/asm/code-patching.h | 65 +++-
arch/powerpc/include/asm/ftrace.h | 4 +-
arch/powerpc/include/asm/inst.h | 13 +-
arch/powerpc/include/asm/linkage.h | 2 +-
arch/powerpc/include/asm/module.h | 2 -
arch/powerpc/include/asm/ppc-opcode.h | 22 +-
arch/powerpc/include/asm/ppc_asm.h | 4 +-
arch/powerpc/include/asm/ptrace.h | 2 +-
arch/powerpc/include/asm/sections.h | 24 +-
arch/powerpc/include/asm/types.h | 8 -
arch/powerpc/kernel/fadump.c | 13 +-
arch/powerpc/kernel/head_64.S | 2 +-
arch/powerpc/kernel/interrupt_64.S | 2 +-
arch/powerpc/kernel/kprobes.c | 6 +-
arch/powerpc/kernel/misc_64.S | 2 +-
arch/powerpc/kernel/module.c | 4 +-
arch/powerpc/kernel/module_32.c | 38 ++-
arch/powerpc/kernel/module_64.c | 7 +-
arch/powerpc/kernel/ptrace/ptrace.c | 6 -
arch/powerpc/kernel/trace/Makefile | 5 +-
arch/powerpc/kernel/trace/ftrace.c | 375 +++++++----------------
arch/powerpc/kvm/book3s_interrupts.S | 2 +-
arch/powerpc/kvm/book3s_rmhandlers.S | 2 +-
arch/powerpc/lib/code-patching.c | 49 +--
arch/powerpc/lib/feature-fixups.c | 2 +-
arch/powerpc/net/bpf_jit.h | 4 +-
arch/powerpc/net/bpf_jit_comp.c | 2 +-
arch/powerpc/net/bpf_jit_comp64.c | 4 +-
arch/powerpc/platforms/Kconfig.cputype | 6 +
32 files changed, 271 insertions(+), 422 deletions(-)

--
2.35.1



2022-05-09 10:12:58

by Christophe Leroy

Subject: [PATCH v3 10/25] powerpc: Finalise cleanup around ABI use

Now that we have CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2,
get rid of all indirect detection of ABI version.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/Kconfig | 2 +-
arch/powerpc/Makefile | 2 +-
arch/powerpc/include/asm/types.h | 8 --------
arch/powerpc/kernel/fadump.c | 13 ++++++++-----
arch/powerpc/kernel/ptrace/ptrace.c | 6 ------
arch/powerpc/net/bpf_jit_comp64.c | 4 ++--
6 files changed, 12 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 174edabb74fa..5514fed3f072 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -208,7 +208,7 @@ config PPC
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
- select HAVE_FUNCTION_DESCRIPTORS if PPC64 && !CPU_LITTLE_ENDIAN
+ select HAVE_FUNCTION_DESCRIPTORS if PPC64_ELF_ABI_V1
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 1ba98be84101..8bd3b631f094 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -213,7 +213,7 @@ CHECKFLAGS += -m$(BITS) -D__powerpc__ -D__powerpc$(BITS)__
ifdef CONFIG_CPU_BIG_ENDIAN
CHECKFLAGS += -D__BIG_ENDIAN__
else
-CHECKFLAGS += -D__LITTLE_ENDIAN__ -D_CALL_ELF=2
+CHECKFLAGS += -D__LITTLE_ENDIAN__
endif

ifdef CONFIG_476FPE_ERR46
diff --git a/arch/powerpc/include/asm/types.h b/arch/powerpc/include/asm/types.h
index 84078c28c1a2..93157a661dcc 100644
--- a/arch/powerpc/include/asm/types.h
+++ b/arch/powerpc/include/asm/types.h
@@ -11,14 +11,6 @@

#include <uapi/asm/types.h>

-#ifdef __powerpc64__
-#if defined(_CALL_ELF) && _CALL_ELF == 2
-#define PPC64_ELF_ABI_v2 1
-#else
-#define PPC64_ELF_ABI_v1 1
-#endif
-#endif /* __powerpc64__ */
-
#ifndef __ASSEMBLY__

typedef __vector128 vector128;
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 65562c4a0a69..5f7224d66586 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -968,11 +968,14 @@ static int fadump_init_elfcore_header(char *bufp)
elf->e_entry = 0;
elf->e_phoff = sizeof(struct elfhdr);
elf->e_shoff = 0;
-#if defined(_CALL_ELF)
- elf->e_flags = _CALL_ELF;
-#else
- elf->e_flags = 0;
-#endif
+
+ if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2))
+ elf->e_flags = 2;
+ else if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V1))
+ elf->e_flags = 1;
+ else
+ elf->e_flags = 0;
+
elf->e_ehsize = sizeof(struct elfhdr);
elf->e_phentsize = sizeof(struct elf_phdr);
elf->e_phnum = 0;
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c b/arch/powerpc/kernel/ptrace/ptrace.c
index 9fbe155a9bd0..4d2dc22d4a2d 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -444,10 +444,4 @@ void __init pt_regs_check(void)
* real registers.
*/
BUILD_BUG_ON(PT_DSCR < sizeof(struct user_pt_regs) / sizeof(unsigned long));
-
-#ifdef CONFIG_PPC64_ELF_ABI_V1
- BUILD_BUG_ON(!IS_ENABLED(CONFIG_HAVE_FUNCTION_DESCRIPTORS));
-#else
- BUILD_BUG_ON(IS_ENABLED(CONFIG_HAVE_FUNCTION_DESCRIPTORS));
-#endif
}
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index d7b42f45669e..594c54931e20 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -126,7 +126,7 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
{
int i;

- if (__is_defined(CONFIG_PPC64_ELF_ABI_V2))
+ if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2))
EMIT(PPC_RAW_LD(_R2, _R13, offsetof(struct paca_struct, kernel_toc)));

/*
@@ -266,7 +266,7 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
int b2p_index = bpf_to_ppc(BPF_REG_3);
int bpf_tailcall_prologue_size = 8;

- if (__is_defined(CONFIG_PPC64_ELF_ABI_V2))
+ if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V2))
bpf_tailcall_prologue_size += 4; /* skip past the toc load */

/*
--
2.35.1


2022-05-09 10:13:00

by Christophe Leroy

Subject: [PATCH v3 22/25] powerpc/inst: Remove PPC_INST_BRANCH

Convert the last users of PPC_INST_BRANCH to PPC_RAW_BRANCH(),
then remove PPC_INST_BRANCH.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/ppc-opcode.h | 3 +--
arch/powerpc/lib/feature-fixups.c | 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 3e9aa96ae74b..1871a86c5436 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -290,7 +290,6 @@
#define PPC_INST_ADDIS 0x3c000000
#define PPC_INST_ADD 0x7c000214
#define PPC_INST_DIVD 0x7c0003d2
-#define PPC_INST_BRANCH 0x48000000
#define PPC_INST_BL 0x48000001
#define PPC_INST_BRANCH_COND 0x40800000

@@ -575,7 +574,7 @@
#define PPC_RAW_MTSPR(spr, d) (0x7c0003a6 | ___PPC_RS(d) | __PPC_SPR(spr))
#define PPC_RAW_EIEIO() (0x7c0006ac)

-#define PPC_RAW_BRANCH(addr) (PPC_INST_BRANCH | ((addr) & 0x03fffffc))
+#define PPC_RAW_BRANCH(offset) (0x48000000 | PPC_LI(offset))
#define PPC_RAW_BL(offset) (0x48000001 | PPC_LI(offset))

/* Deal with instructions that older assemblers aren't aware of */
diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 343a78826035..993d3f31832a 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -451,7 +451,7 @@ static int __do_rfi_flush_fixups(void *data)

if (types & L1D_FLUSH_FALLBACK)
/* b .+16 to fallback flush */
- instrs[0] = PPC_INST_BRANCH | 16;
+ instrs[0] = PPC_RAW_BRANCH(16);

i = 0;
if (types & L1D_FLUSH_ORI) {
--
2.35.1
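
For readers following the encoding change in this patch, here is a minimal user-space sketch of what PPC_RAW_BRANCH() and PPC_RAW_BL() compute. The `sketch_` names are hypothetical, and the mask value is assumed to be what PPC_LI() applies, since it matches the literal 0x03fffffc in the old PPC_RAW_BRANCH() definition.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed LI field mask: same literal as in the old PPC_RAW_BRANCH(). */
#define SKETCH_LI_MASK 0x03fffffcu

/* Encode "b .+offset": relative branch, link bit clear. */
static uint32_t sketch_raw_branch(int32_t offset)
{
	assert((offset & 3) == 0);	/* branch targets are word aligned */
	return 0x48000000u | ((uint32_t)offset & SKETCH_LI_MASK);
}

/* Encode "bl .+offset": same opcode with the link bit (LK) set. */
static uint32_t sketch_raw_bl(int32_t offset)
{
	assert((offset & 3) == 0);
	return 0x48000001u | ((uint32_t)offset & SKETCH_LI_MASK);
}
```

With these definitions, sketch_raw_branch(16) yields 0x48000010, the same word the old `PPC_INST_BRANCH | 16` expression produced in __do_rfi_flush_fixups().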


2022-05-09 10:13:04

by Christophe Leroy

Subject: [PATCH v3 18/25] powerpc/ftrace: Simplify expected_nop_sequence()

Avoid ifdefs around expected_nop_sequence().

While at it make it a bool.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/trace/ftrace.c | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 346b5485e7ef..c34cb394f8a8 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -390,24 +390,14 @@ int ftrace_make_nop(struct module *mod,
* They should effectively be a NOP, and follow formal constraints,
* depending on the ABI. Return false if they don't.
*/
-#ifdef CONFIG_PPC64_ELF_ABI_V1
-static int
-expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
-{
- if (!ppc_inst_equal(op0, ppc_inst(PPC_RAW_BRANCH(8))) ||
- !ppc_inst_equal(op1, ppc_inst(PPC_INST_LD_TOC)))
- return 0;
- return 1;
-}
-#else
-static int
-expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
+static bool expected_nop_sequence(void *ip, ppc_inst_t op0, ppc_inst_t op1)
{
- if (!ppc_inst_equal(op0, ppc_inst(PPC_RAW_NOP())))
- return 0;
- return 1;
+ if (IS_ENABLED(CONFIG_PPC64_ELF_ABI_V1))
+ return ppc_inst_equal(op0, ppc_inst(PPC_RAW_BRANCH(8))) &&
+ ppc_inst_equal(op1, ppc_inst(PPC_INST_LD_TOC));
+ else
+ return ppc_inst_equal(op0, ppc_inst(PPC_RAW_NOP()));
}
-#endif

static int
__ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
--
2.35.1
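
The pattern used in this patch, a single function body selecting behaviour through a compile-time constant instead of two #ifdef'ed copies, can be sketched in plain C as follows. SKETCH_ABI_V1 is a hypothetical stand-in for IS_ENABLED(CONFIG_PPC64_ELF_ABI_V1), and the instruction words are assumptions worked out by hand (in particular the TOC reload is assumed to be `ld r2,40(r1)`).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_PPC64_ELF_ABI_V1). */
#ifndef SKETCH_ABI_V1
#define SKETCH_ABI_V1 0
#endif

#define SKETCH_BRANCH_8	0x48000008u	/* b .+8 */
#define SKETCH_LD_TOC	0xe8410028u	/* ld r2,40(r1): assumed ELFv1 TOC reload */
#define SKETCH_NOP	0x60000000u	/* ori r0,r0,0 */

/* Both ABI variants live in one body; the compiler sees both branches. */
static bool sketch_nop_sequence_ok(bool abi_v1, uint32_t op0, uint32_t op1)
{
	if (abi_v1)
		return op0 == SKETCH_BRANCH_8 && op1 == SKETCH_LD_TOC;
	return op0 == SKETCH_NOP;
}

/* With a constant argument the unused branch can be discarded at build time. */
static bool sketch_expected_nop_sequence(uint32_t op0, uint32_t op1)
{
	return sketch_nop_sequence_ok(SKETCH_ABI_V1, op0, op1);
}
```

The benefit over #ifdef is that both branches are always parsed and type-checked, whichever configuration is being built.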


2022-05-09 10:14:50

by Christophe Leroy

Subject: [PATCH v3 25/25] powerpc/opcodes: Remove unused PPC_INST_XXX macros

The following PPC_INST_XXX macros are not used anymore
outside ppc-opcode.h:
- PPC_INST_LD
- PPC_INST_STD
- PPC_INST_ADDIS
- PPC_INST_ADD
- PPC_INST_DIVD

Remove them.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/ppc-opcode.h | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 9ca8996ee1cd..b9d6f95b66e9 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -285,11 +285,6 @@
#define PPC_INST_TRECHKPT 0x7c0007dd
#define PPC_INST_TRECLAIM 0x7c00075d
#define PPC_INST_TSR 0x7c0005dd
-#define PPC_INST_LD 0xe8000000
-#define PPC_INST_STD 0xf8000000
-#define PPC_INST_ADDIS 0x3c000000
-#define PPC_INST_ADD 0x7c000214
-#define PPC_INST_DIVD 0x7c0003d2
#define PPC_INST_BRANCH_COND 0x40800000

/* Prefixes */
@@ -462,10 +457,10 @@
(0x100000c7 | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | __PPC_RC21)
#define PPC_RAW_VCMPEQUB_RC(vrt, vra, vrb) \
(0x10000006 | ___PPC_RT(vrt) | ___PPC_RA(vra) | ___PPC_RB(vrb) | __PPC_RC21)
-#define PPC_RAW_LD(r, base, i) (PPC_INST_LD | ___PPC_RT(r) | ___PPC_RA(base) | IMM_DS(i))
+#define PPC_RAW_LD(r, base, i) (0xe8000000 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_DS(i))
#define PPC_RAW_LWZ(r, base, i) (0x80000000 | ___PPC_RT(r) | ___PPC_RA(base) | IMM_L(i))
#define PPC_RAW_LWZX(t, a, b) (0x7c00002e | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_RAW_STD(r, base, i) (PPC_INST_STD | ___PPC_RS(r) | ___PPC_RA(base) | IMM_DS(i))
+#define PPC_RAW_STD(r, base, i) (0xf8000000 | ___PPC_RS(r) | ___PPC_RA(base) | IMM_DS(i))
#define PPC_RAW_STDCX(s, a, b) (0x7c0001ad | ___PPC_RS(s) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_LFSX(t, a, b) (0x7c00042e | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_STFSX(s, a, b) (0x7c00052e | ___PPC_RS(s) | ___PPC_RA(a) | ___PPC_RB(b))
@@ -476,8 +471,8 @@
#define PPC_RAW_ADDE(t, a, b) (0x7c000114 | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_ADDZE(t, a) (0x7c000194 | ___PPC_RT(t) | ___PPC_RA(a))
#define PPC_RAW_ADDME(t, a) (0x7c0001d4 | ___PPC_RT(t) | ___PPC_RA(a))
-#define PPC_RAW_ADD(t, a, b) (PPC_INST_ADD | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b))
-#define PPC_RAW_ADD_DOT(t, a, b) (PPC_INST_ADD | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b) | 0x1)
+#define PPC_RAW_ADD(t, a, b) (0x7c000214 | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_RAW_ADD_DOT(t, a, b) (0x7c000214 | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b) | 0x1)
#define PPC_RAW_ADDC(t, a, b) (0x7c000014 | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b))
#define PPC_RAW_ADDC_DOT(t, a, b) (0x7c000014 | ___PPC_RT(t) | ___PPC_RA(a) | ___PPC_RB(b) | 0x1)
#define PPC_RAW_NOP() PPC_RAW_ORI(0, 0, 0)
--
2.35.1


2022-05-09 10:16:14

by Christophe Leroy

Subject: [PATCH v3 20/25] powerpc/inst: Add __copy_inst_from_kernel_nofault()

On the same model as get_user() versus __get_user(),
introduce __copy_inst_from_kernel_nofault() which doesn't
check the address.

It is to be used by callers that have already checked that the address
is a kernel address.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/inst.h | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/inst.h b/arch/powerpc/include/asm/inst.h
index 80b6d74146c6..b49aae9f6f27 100644
--- a/arch/powerpc/include/asm/inst.h
+++ b/arch/powerpc/include/asm/inst.h
@@ -158,13 +158,10 @@ static inline char *__ppc_inst_as_str(char str[PPC_INST_STR_LEN], ppc_inst_t x)
__str; \
})

-static inline int copy_inst_from_kernel_nofault(ppc_inst_t *inst, u32 *src)
+static inline int __copy_inst_from_kernel_nofault(ppc_inst_t *inst, u32 *src)
{
unsigned int val, suffix;

- if (unlikely(!is_kernel_addr((unsigned long)src)))
- return -ERANGE;
-
/* See https://github.com/ClangBuiltLinux/linux/issues/1521 */
#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 140000
val = suffix = 0;
@@ -181,4 +178,12 @@ static inline int copy_inst_from_kernel_nofault(ppc_inst_t *inst, u32 *src)
return -EFAULT;
}

+static inline int copy_inst_from_kernel_nofault(ppc_inst_t *inst, u32 *src)
+{
+ if (unlikely(!is_kernel_addr((unsigned long)src)))
+ return -ERANGE;
+
+ return __copy_inst_from_kernel_nofault(inst, src);
+}
+
#endif /* _ASM_POWERPC_INST_H */
--
2.35.1
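
The checked/unchecked split introduced here can be illustrated with a small user-space analogue. Everything below is hypothetical: a static array stands in for kernel text, sketch_addr_ok() stands in for is_kernel_addr(), and the plain load stands in for the fault-safe read done by the real helper.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel address range. */
static const uint32_t sketch_text[4] = {
	0x60000000u, 0x48000008u, 0x7d8903a6u, 0x4e800420u,
};

/* Hypothetical stand-in for is_kernel_addr(). */
static int sketch_addr_ok(const uint32_t *src)
{
	uintptr_t p = (uintptr_t)src;
	uintptr_t lo = (uintptr_t)sketch_text;
	uintptr_t hi = (uintptr_t)(sketch_text + 4);

	return p >= lo && p < hi;
}

/* Unchecked variant: the caller has already validated src. */
static int sketch_copy_inst_unchecked(uint32_t *inst, const uint32_t *src)
{
	*inst = *src;
	return 0;
}

/* Checked variant: validate the address, then reuse the unchecked body. */
static int sketch_copy_inst(uint32_t *inst, const uint32_t *src)
{
	if (!sketch_addr_ok(src))
		return -ERANGE;

	return sketch_copy_inst_unchecked(inst, src);
}
```

A caller reading several consecutive words would use the checked variant once and the unchecked variant for the remaining reads.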


2022-05-09 10:16:25

by Christophe Leroy

Subject: [PATCH v3 21/25] powerpc/ftrace: Don't use copy_from_kernel_nofault() in module_trampoline_target()

module_trampoline_target() is quite a hot path used when
activating/deactivating the function tracer.

Avoid the heavy copy_from_kernel_nofault() by doing four calls
to copy_inst_from_kernel_nofault().

Use __copy_inst_from_kernel_nofault() for the last 3 calls. The first call
is made to copy_inst_from_kernel_nofault() to check that the address is
within kernel space. There is no risk of wrapping past the top of kernel
space because the last page is never mapped, so if the address is in the
last page the first copy will fail and the other ones will never be
performed.

Also make it notrace just like all functions that call it.

Signed-off-by: Christophe Leroy <[email protected]>
---
v3: Use ppc_inst_t to fix sparse warnings and split trampoline verification in one line per instruction.
---
arch/powerpc/kernel/module_32.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
index a0432ef46967..715a42f383d0 100644
--- a/arch/powerpc/kernel/module_32.c
+++ b/arch/powerpc/kernel/module_32.c
@@ -289,23 +289,32 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
}

#ifdef CONFIG_DYNAMIC_FTRACE
-int module_trampoline_target(struct module *mod, unsigned long addr,
- unsigned long *target)
+notrace int module_trampoline_target(struct module *mod, unsigned long addr,
+ unsigned long *target)
{
- unsigned int jmp[4];
+ ppc_inst_t jmp[4];

/* Find where the trampoline jumps to */
- if (copy_from_kernel_nofault(jmp, (void *)addr, sizeof(jmp)))
+ if (copy_inst_from_kernel_nofault(jmp, (void *)addr))
+ return -EFAULT;
+ if (__copy_inst_from_kernel_nofault(jmp + 1, (void *)addr + 4))
+ return -EFAULT;
+ if (__copy_inst_from_kernel_nofault(jmp + 2, (void *)addr + 8))
+ return -EFAULT;
+ if (__copy_inst_from_kernel_nofault(jmp + 3, (void *)addr + 12))
return -EFAULT;

/* verify that this is what we expect it to be */
- if ((jmp[0] & 0xffff0000) != PPC_RAW_LIS(_R12, 0) ||
- (jmp[1] & 0xffff0000) != PPC_RAW_ADDI(_R12, _R12, 0) ||
- jmp[2] != PPC_RAW_MTCTR(_R12) ||
- jmp[3] != PPC_RAW_BCTR())
+ if ((ppc_inst_val(jmp[0]) & 0xffff0000) != PPC_RAW_LIS(_R12, 0))
+ return -EINVAL;
+ if ((ppc_inst_val(jmp[1]) & 0xffff0000) != PPC_RAW_ADDI(_R12, _R12, 0))
+ return -EINVAL;
+ if (ppc_inst_val(jmp[2]) != PPC_RAW_MTCTR(_R12))
+ return -EINVAL;
+ if (ppc_inst_val(jmp[3]) != PPC_RAW_BCTR())
return -EINVAL;

- addr = (jmp[1] & 0xffff) | ((jmp[0] & 0xffff) << 16);
+ addr = (ppc_inst_val(jmp[1]) & 0xffff) | ((ppc_inst_val(jmp[0]) & 0xffff) << 16);
if (addr & 0x8000)
addr -= 0x10000;

--
2.35.1
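
The trampoline check and the address reconstruction in this patch can be tried out in user space with the sketch below. The `sketch_` names are hypothetical, and the four instruction words are assumptions encoded by hand for `lis r12`, `addi r12,r12`, `mtctr r12` and `bctr`. The key step is the carry correction: addi sign-extends its 16-bit immediate, so when bit 15 of the low half is set the high half was stored one higher and 0x10000 must be subtracted.

```c
#include <stdint.h>

#define SKETCH_LIS_R12		0x3d800000u	/* lis r12,0 */
#define SKETCH_ADDI_R12_R12	0x398c0000u	/* addi r12,r12,0 */
#define SKETCH_MTCTR_R12	0x7d8903a6u	/* mtctr r12 */
#define SKETCH_BCTR		0x4e800420u	/* bctr */

/*
 * Return the 32-bit branch target encoded in the four trampoline words,
 * or -1 if they do not match the expected lis/addi/mtctr/bctr sequence.
 */
static int64_t sketch_trampoline_target(uint32_t i0, uint32_t i1,
					uint32_t i2, uint32_t i3)
{
	uint32_t addr;

	if ((i0 & 0xffff0000u) != SKETCH_LIS_R12)
		return -1;
	if ((i1 & 0xffff0000u) != SKETCH_ADDI_R12_R12)
		return -1;
	if (i2 != SKETCH_MTCTR_R12)
		return -1;
	if (i3 != SKETCH_BCTR)
		return -1;

	/* Low half from addi, high half from lis. */
	addr = (i1 & 0xffffu) | ((i0 & 0xffffu) << 16);
	/* addi sign-extends its immediate: undo the borrow. */
	if (addr & 0x8000u)
		addr -= 0x10000u;

	return addr;
}
```

For a target of 0x12348000 the trampoline holds `lis r12,0x1235` and `addi r12,r12,-0x8000`, which is why the correction is needed.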


2022-05-09 10:17:10

by Christophe Leroy

Subject: [PATCH v3 01/25] powerpc/ftrace: Refactor prepare_ftrace_return()

When we have CONFIG_DYNAMIC_FTRACE_WITH_ARGS,
prepare_ftrace_return() is called by ftrace_graph_func();
otherwise prepare_ftrace_return() is called from assembly.

Refactor prepare_ftrace_return() into a static
__prepare_ftrace_return() that will be called by both
prepare_ftrace_return() and ftrace_graph_func().

It will allow GCC to fold __prepare_ftrace_return() inside
ftrace_graph_func().

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/trace/ftrace.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 4ee04aacf9f1..7a266fd469b7 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -939,8 +939,8 @@ int ftrace_disable_ftrace_graph_caller(void)
* Hook the return address and push it in the stack of return addrs
* in current thread info. Return the address we want to divert to.
*/
-unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
- unsigned long sp)
+static unsigned long
+__prepare_ftrace_return(unsigned long parent, unsigned long ip, unsigned long sp)
{
unsigned long return_hooker;
int bit;
@@ -969,7 +969,13 @@ unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs)
{
- fregs->regs.link = prepare_ftrace_return(parent_ip, ip, fregs->regs.gpr[1]);
+ fregs->regs.link = __prepare_ftrace_return(parent_ip, ip, fregs->regs.gpr[1]);
+}
+#else
+unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip,
+ unsigned long sp)
+{
+ return __prepare_ftrace_return(parent, ip, sp);
}
#endif
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
--
2.35.1


2022-05-09 10:17:42

by Christophe Leroy

Subject: [PATCH v3 15/25] powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1

To make it explicit, use BRANCH_SET_LINK instead of value 1
when calling create_branch().

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/trace/ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 010a8c7ff4ac..c4a68340a351 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -45,7 +45,7 @@ ftrace_call_replace(unsigned long ip, unsigned long addr, int link)
addr = ppc_function_entry((void *)addr);

/* if (link) set op to 'bl' else 'b' */
- create_branch(&op, (u32 *)ip, addr, link ? 1 : 0);
+ create_branch(&op, (u32 *)ip, addr, link ? BRANCH_SET_LINK : 0);

return op;
}
--
2.35.1


2022-05-09 10:17:47

by Christophe Leroy

Subject: [PATCH v3 06/25] powerpc/ftrace: Inline ftrace_modify_code()

Inlining ftrace_modify_code() slightly increases the
size of the ftrace code but brings a 5% improvement in ftrace
activation.

Usually in C files we let gcc decide what to do, but here
it really helps to 'help' gcc decide to inline, though
we don't want to force it with an __always_inline that
would be too much for CONFIG_CC_OPTIMIZE_FOR_SIZE.

Signed-off-by: Christophe Leroy <[email protected]>
---
v2: More explanation in commit message
---
arch/powerpc/kernel/trace/ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 41c45b9c7f39..98e82fa4980f 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -53,7 +53,7 @@ ftrace_call_replace(unsigned long ip, unsigned long addr, int link)
return op;
}

-static int
+static inline int
ftrace_modify_code(unsigned long ip, ppc_inst_t old, ppc_inst_t new)
{
ppc_inst_t replaced;
--
2.35.1


2022-05-09 10:18:46

by Christophe Leroy

Subject: [PATCH v3 04/25] powerpc/ftrace: Use is_offset_in_branch_range()

Use is_offset_in_branch_range() instead of create_branch()
to check if a target is within branch range.

This patch, together with the previous one, improves
ftrace activation time by 7%.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/trace/ftrace.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace.c b/arch/powerpc/kernel/trace/ftrace.c
index 3ce3697e8a7c..41c45b9c7f39 100644
--- a/arch/powerpc/kernel/trace/ftrace.c
+++ b/arch/powerpc/kernel/trace/ftrace.c
@@ -89,11 +89,9 @@ ftrace_modify_code(unsigned long ip, ppc_inst_t old, ppc_inst_t new)
*/
static int test_24bit_addr(unsigned long ip, unsigned long addr)
{
- ppc_inst_t op;
addr = ppc_function_entry((void *)addr);

- /* use the create_branch to verify that this offset can be branched */
- return create_branch(&op, (u32 *)ip, addr, 0) == 0;
+ return is_offset_in_branch_range(addr - ip);
}

static int is_bl_op(ppc_inst_t op)
@@ -261,7 +259,6 @@ __ftrace_make_nop(struct module *mod,
static unsigned long find_ftrace_tramp(unsigned long ip)
{
int i;
- ppc_inst_t instr;

/*
* We have the compiler generated long_branch tramps at the end
@@ -270,8 +267,7 @@ static unsigned long find_ftrace_tramp(unsigned long ip)
for (i = NUM_FTRACE_TRAMPS - 1; i >= 0; i--)
if (!ftrace_tramps[i])
continue;
- else if (create_branch(&instr, (void *)ip,
- ftrace_tramps[i], 0) == 0)
+ else if (is_offset_in_branch_range(ftrace_tramps[i] - ip))
return ftrace_tramps[i];

return 0;
--
2.35.1
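
To show why the range test alone is enough for test_24bit_addr(), here is a minimal sketch of the check. The bounds are an assumption derived from the branch encoding (a 24-bit LI field holding a word offset, giving a signed 26-bit byte range), not a copy of the kernel helper, and the `sketch_` names are hypothetical.

```c
#include <stdbool.h>

/*
 * A relative branch can reach offsets in [-32 MiB, 32 MiB - 4],
 * and the offset must be a multiple of 4.
 */
static bool sketch_offset_in_branch_range(long offset)
{
	return offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3);
}

/* Same shape as test_24bit_addr(): no instruction is built, only tested. */
static bool sketch_test_24bit_addr(unsigned long ip, unsigned long addr)
{
	return sketch_offset_in_branch_range((long)(addr - ip));
}
```

Compared with calling create_branch() and discarding the result, this avoids assembling an instruction word that is never used.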


2022-05-25 02:39:13

by Michael Ellerman

Subject: Re: [PATCH v3 00/25] powerpc: ftrace optimisation and cleanup and more [v3]

On Mon, 9 May 2022 07:35:58 +0200, Christophe Leroy wrote:
> This series provides optimisation and cleanup of ftrace on powerpc.
>
> With this series ftrace activation is about 20% faster on an 8xx.
>
> At the end of the series come additional cleanups around ppc-opcode,
> that would likely conflict with this series if posted separately.
>
> [...]

Applied to powerpc/next.

[01/25] powerpc/ftrace: Refactor prepare_ftrace_return()
https://git.kernel.org/powerpc/c/d996d5053eb5c0abc0358e5670014a62ada6967e
[02/25] powerpc/ftrace: Remove redundant create_branch() calls
https://git.kernel.org/powerpc/c/ae3a2a2188214adc355a5bdf6deb29120886c96f
[03/25] powerpc/code-patching: Inline is_offset_in_{cond}_branch_range()
https://git.kernel.org/powerpc/c/1acbf27e8a5843911d122ad0008e79ec5f7b6382
[04/25] powerpc/ftrace: Use is_offset_in_branch_range()
https://git.kernel.org/powerpc/c/a1facd2578b312770aaea384adc7de0ed3f543d1
[05/25] powerpc/code-patching: Inline create_branch()
https://git.kernel.org/powerpc/c/d2f47dabf1252520a88d257133e6bdec474fd935
[06/25] powerpc/ftrace: Inline ftrace_modify_code()
https://git.kernel.org/powerpc/c/2c920fca8c70287c4448f2653a388ecca7b32e83
[07/25] powerpc/ftrace: Use patch_instruction() return directly
https://git.kernel.org/powerpc/c/bbffdd2fc743bdc529f9a8264bdb5d3491f58c95
[08/25] powerpc: Add CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2
https://git.kernel.org/powerpc/c/661aa880398add5c27943cb077c451a45cc112a1
[09/25] powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2}
https://git.kernel.org/powerpc/c/7d40aff8213c92e64a1576ba9dfebcd201c0564d
[10/25] powerpc: Finalise cleanup around ABI use
https://git.kernel.org/powerpc/c/5b89492c03e5c0a2c259b97d7d4c1bb9b02860aa
[11/25] powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64
https://git.kernel.org/powerpc/c/23b44fc248f420bbcd0dcd290c3399885360984d
[12/25] powerpc/ftrace: Don't include ftrace.o for CONFIG_FTRACE_SYSCALLS
https://git.kernel.org/powerpc/c/a3d0f5b4b7e425b8abeadda1e76496bda88989bd
[13/25] powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of CONFIG_DYNAMIC_FTRACE
https://git.kernel.org/powerpc/c/c2cba93d1a5e2475a636b5cb974da6b73d7a72df
[14/25] powerpc/ftrace: Remove ftrace_plt_tramps[]
https://git.kernel.org/powerpc/c/ccf6607e45aaf5e0ceabfe018aeb01818a936697
[15/25] powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1
https://git.kernel.org/powerpc/c/cf9df92a823ce24c19c4c64b334dc5cadd74fa98
[16/25] powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding.
https://git.kernel.org/powerpc/c/e89aa642be21b14e53bab40a37b8c6b0cf05143d
[17/25] powerpc/ftrace: Use size macro instead of opencoding
https://git.kernel.org/powerpc/c/c8deb28095f9cd2ee2f4d16e948c9e816a22811b
[18/25] powerpc/ftrace: Simplify expected_nop_sequence()
https://git.kernel.org/powerpc/c/b97d0e3dcfba07590ec3d2ca2b95b2f029962d16
[19/25] powerpc/ftrace: Minimise number of #ifdefs
https://git.kernel.org/powerpc/c/af8b9f352ffd435734ab8f94f99ccb922da916b4
[20/25] powerpc/inst: Add __copy_inst_from_kernel_nofault()
https://git.kernel.org/powerpc/c/8dfdbe4368c09d9eeae2df8968ee6c345ec8c1b5
[21/25] powerpc/ftrace: Don't use copy_from_kernel_nofault() in module_trampoline_target()
https://git.kernel.org/powerpc/c/8052d043a48f733905e8ea8f900bf58b441a317f
[22/25] powerpc/inst: Remove PPC_INST_BRANCH
https://git.kernel.org/powerpc/c/4390a58ee1c37dc915dcf44fabe925b160f5bcf0
[23/25] powerpc/modules: Use PPC_LI macros instead of opencoding
https://git.kernel.org/powerpc/c/e0c2ef43210b023ed9a58c520c2fbede7010c592
[24/25] powerpc/inst: Remove PPC_INST_BL
https://git.kernel.org/powerpc/c/ae2c760fa10ba2475aa46fffa6be42050586c604
[25/25] powerpc/opcodes: Remove unused PPC_INST_XXX macros
https://git.kernel.org/powerpc/c/6bdc81eca9519a85d36b3915136640ef9cba1a23

cheers