2023-10-09 13:03:44

by Tiezhu Yang

Subject: [PATCH v2 0/8] Add objtool and orc support for LoongArch

This version is based on 6.6-rc5 and was tested with the latest upstream
gcc and binutils (20231009). All of the objtool warnings have been
silenced.

Patches #5, #6 and #7 are based on the following objdump info: the
latest upstream gas for LoongArch replaces each pair of ADD32/64 and
SUB32/64 with a single 32/64_PCREL, and the option -mrelax is used by
default. As a result, branch and jump operations target local labels,
and the reloc symbol is label + offset instead of section + offset.

The binutils should contain the following two commits:
(1) Use 32/64_PCREL to replace a pair of ADD32/64 and SUB32/64
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
(2) as: add option for generate R_LARCH_32/64_PCREL
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
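
As a rough illustration of why one PCREL relocation can stand in for an
ADD/SUB pair, the sketch below assumes the relocation formulas from the
LoongArch ELF psABI (ADD32 adds S + A to the field, SUB32 subtracts S + A,
and 32_PCREL stores S + A - PC). The function names are hypothetical and
are not part of binutils or objtool.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the old scheme: ADD32 against the target symbol,
 * then SUB32 against a label placed at the relocated location.
 */
static int32_t demo_add_sub_pair(uint64_t target, uint64_t place_label,
				 int64_t addend)
{
	uint32_t field = 0;

	field += (uint32_t)(target + (uint64_t)addend);	/* ADD32: += S + A */
	field -= (uint32_t)place_label;			/* SUB32: -= S + A */
	return (int32_t)field;
}

/* Hypothetical model of the new scheme: one 32_PCREL, S + A - PC. */
static int32_t demo_32_pcrel(uint64_t target, uint64_t pc, int64_t addend)
{
	return (int32_t)(uint32_t)(target + (uint64_t)addend - pc);
}

/* Both models should agree when the SUB32 label is the place itself. */
static void demo_check_equivalent(uint64_t target, uint64_t place,
				  int64_t addend)
{
	assert(demo_add_sub_pair(target, place, addend) ==
	       demo_32_pcrel(target, place, addend));
}
```

The two models agree only when the subtracted symbol labels the relocated
place itself, which is presumably the pattern the assembler rewrites.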

Tiezhu Yang (8):
objtool/LoongArch: Enable objtool to be built
objtool/LoongArch: Implement instruction decoder
objtool/x86: Separate arch-specific and generic parts
objtool/LoongArch: Enable orc to be built
objtool: Check local label about sibling call
objtool: Check local label in add_dead_ends()
objtool: Check local label in read_unwind_hints()
LoongArch: Add ORC unwinder support

arch/loongarch/Kconfig | 2 +
arch/loongarch/Kconfig.debug | 11 +
arch/loongarch/Makefile | 23 +
arch/loongarch/configs/loongson3_defconfig | 1 +
arch/loongarch/include/asm/Kbuild | 1 +
arch/loongarch/include/asm/bug.h | 1 +
arch/loongarch/include/asm/linkage.h | 2 +
arch/loongarch/include/asm/module.h | 7 +
arch/loongarch/include/asm/orc_header.h | 19 +
arch/loongarch/include/asm/orc_lookup.h | 34 ++
arch/loongarch/include/asm/orc_types.h | 58 +++
arch/loongarch/include/asm/stackframe.h | 3 +
arch/loongarch/include/asm/unwind.h | 22 +-
arch/loongarch/include/asm/unwind_hints.h | 28 +
arch/loongarch/kernel/Makefile | 3 +
arch/loongarch/kernel/entry.S | 9 +-
arch/loongarch/kernel/genex.S | 20 +-
arch/loongarch/kernel/head.S | 1 +
arch/loongarch/kernel/module.c | 11 +-
arch/loongarch/kernel/relocate_kernel.S | 2 +
arch/loongarch/kernel/setup.c | 2 +
arch/loongarch/kernel/stacktrace.c | 1 +
arch/loongarch/kernel/unwind_orc.c | 571 +++++++++++++++++++++
arch/loongarch/kernel/vmlinux.lds.S | 3 +
arch/loongarch/lib/Makefile | 2 +
arch/loongarch/mm/tlbex.S | 45 +-
arch/loongarch/power/Makefile | 2 +
arch/loongarch/vdso/Makefile | 1 +
include/linux/compiler.h | 9 +
scripts/Makefile | 5 +-
tools/arch/loongarch/include/asm/inst.h | 161 ++++++
tools/arch/loongarch/include/asm/orc_types.h | 58 +++
tools/include/linux/bitops.h | 11 +
tools/objtool/Makefile | 4 +
tools/objtool/arch/loongarch/Build | 3 +
tools/objtool/arch/loongarch/decode.c | 334 ++++++++++++
.../objtool/arch/loongarch/include/arch/cfi_regs.h | 21 +
tools/objtool/arch/loongarch/include/arch/elf.h | 30 ++
.../objtool/arch/loongarch/include/arch/special.h | 33 ++
tools/objtool/arch/loongarch/orc.c | 155 ++++++
tools/objtool/arch/loongarch/special.c | 15 +
tools/objtool/arch/x86/Build | 1 +
tools/objtool/arch/x86/orc.c | 169 ++++++
tools/objtool/check.c | 118 +++--
tools/objtool/include/objtool/orc.h | 10 +
tools/objtool/orc_dump.c | 69 +--
tools/objtool/orc_gen.c | 92 +---
47 files changed, 1949 insertions(+), 234 deletions(-)
create mode 100644 arch/loongarch/include/asm/orc_header.h
create mode 100644 arch/loongarch/include/asm/orc_lookup.h
create mode 100644 arch/loongarch/include/asm/orc_types.h
create mode 100644 arch/loongarch/include/asm/unwind_hints.h
create mode 100644 arch/loongarch/kernel/unwind_orc.c
create mode 100644 tools/arch/loongarch/include/asm/inst.h
create mode 100644 tools/arch/loongarch/include/asm/orc_types.h
create mode 100644 tools/objtool/arch/loongarch/Build
create mode 100644 tools/objtool/arch/loongarch/decode.c
create mode 100644 tools/objtool/arch/loongarch/include/arch/cfi_regs.h
create mode 100644 tools/objtool/arch/loongarch/include/arch/elf.h
create mode 100644 tools/objtool/arch/loongarch/include/arch/special.h
create mode 100644 tools/objtool/arch/loongarch/orc.c
create mode 100644 tools/objtool/arch/loongarch/special.c
create mode 100644 tools/objtool/arch/x86/orc.c
create mode 100644 tools/objtool/include/objtool/orc.h

--
2.1.0


2023-10-09 13:03:48

by Tiezhu Yang

Subject: [PATCH v2 7/8] objtool: Check local label in read_unwind_hints()

After updating to the latest upstream gcc and binutils, which enable
linker relaxation by default, objtool generates more warnings on LoongArch.

In the relocation section '.rela.discard.unwind_hints', the reloc symbol
is a local label instead of a section, so its type is STT_NOTYPE instead
of STT_SECTION. Check for this case so that read_unwind_hints() does not
return -1, and use reloc->sym->offset instead of the reloc addend (which
is 0) to find the corresponding instruction.
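
The selection rule described above can be sketched in isolation as follows.
This is a minimal model, not objtool code: struct demo_sym, struct
demo_reloc and demo_reloc_insn_offset() are hypothetical stand-ins for
objtool's own types and helpers, and the STT_* values are the usual ELF
constants.

```c
#include <assert.h>
#include <string.h>

#ifndef STT_NOTYPE
#define STT_NOTYPE	0
#endif
#ifndef STT_SECTION
#define STT_SECTION	3
#endif

/* Hypothetical, simplified stand-ins for objtool's symbol and reloc. */
struct demo_sym {
	const char *name;
	unsigned char type;
	unsigned long offset;	/* offset of the symbol within its section */
};

struct demo_reloc {
	const struct demo_sym *sym;
	long addend;
};

/* Returns 0 and stores the instruction offset, or -1 on an unexpected type. */
static int demo_reloc_insn_offset(const struct demo_reloc *reloc,
				  unsigned long *offset)
{
	assert(reloc && reloc->sym && offset);

	if (reloc->sym->type == STT_SECTION) {
		/* section + addend: the addend is the instruction offset */
		*offset = (unsigned long)reloc->addend;
		return 0;
	}

	if (reloc->sym->type == STT_NOTYPE &&
	    strncmp(reloc->sym->name, ".L", 2) == 0) {
		/* local label + 0: the label's own offset locates the insn */
		*offset = reloc->sym->offset;
		return 0;
	}

	return -1;
}
```

With the readelf entry shown in this message (.Lhere_1 at 0x1c, addend 0),
this sketch would resolve to offset 0x1c rather than 0.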

Here is some detailed info:
[fedora@linux 6.6.test]$ gcc --version
gcc (GCC) 14.0.0 20231009 (experimental)
[fedora@linux 6.6.test]$ as --version
GNU assembler (GNU Binutils) 2.41.50.20231009
[fedora@linux 6.6.test]$ readelf -r arch/loongarch/kernel/entry.o | grep -A 2 "rela.discard.unwind_hints"
Relocation section '.rela.discard.unwind_hints' at offset 0x458 contains 7 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000000000 000800000063 R_LARCH_32_PCREL 000000000000001c .Lhere_1 + 0

Signed-off-by: Tiezhu Yang <[email protected]>
---
tools/objtool/check.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index eee5621..607a745 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2227,6 +2227,7 @@ static int read_unwind_hints(struct objtool_file *file)
struct unwind_hint *hint;
struct instruction *insn;
struct reloc *reloc;
+ unsigned long offset;
int i;

sec = find_section_by_name(file->elf, ".discard.unwind_hints");
@@ -2254,7 +2255,17 @@ static int read_unwind_hints(struct objtool_file *file)
return -1;
}

- insn = find_insn(file, reloc->sym->sec, reloc_addend(reloc));
+ if (reloc->sym->type == STT_SECTION) {
+ offset = reloc_addend(reloc);
+ } else if ((reloc->sym->type == STT_NOTYPE) &&
+ strncmp(reloc->sym->name, ".L", 2) == 0) {
+ offset = reloc->sym->offset;
+ } else {
+ WARN("unexpected relocation symbol type in %s", sec->rsec->name);
+ return -1;
+ }
+
+ insn = find_insn(file, reloc->sym->sec, offset);
if (!insn) {
WARN("can't find insn for unwind_hints[%d]", i);
return -1;
--
2.1.0

2023-10-09 13:03:51

by Tiezhu Yang

Subject: [PATCH v2 1/8] objtool/LoongArch: Enable objtool to be built

Add the minimal changes needed to build objtool on LoongArch. Most of
the functions are stubs that only fix the build errors from
make -C tools/objtool.

This is similar to commit e52ec98c5ab1 ("objtool/powerpc:
Enable objtool to be built on ppc").
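
The special.h added by this patch hard-codes entry sizes and field offsets
for the exception table, jump table and alternatives. The sketch below
shows how such constants can be cross-checked against a struct layout at
compile time. The demo_* structs and their field names are hypothetical
mirrors built to match the constants; they are not copied from the kernel
headers that special.h refers to.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/* Hypothetical mirror of an exception table entry (EX_ENTRY_SIZE 12). */
struct demo_ex_entry {
	int32_t insn;		/* EX_ORIG_OFFSET 0 */
	int32_t fixup;		/* EX_NEW_OFFSET 4 */
	int16_t type;
	int16_t data;
};

/* Hypothetical mirror of a jump entry (JUMP_ENTRY_SIZE 16). */
struct demo_jump_entry {
	int32_t code;		/* JUMP_ORIG_OFFSET 0 */
	int32_t target;		/* JUMP_NEW_OFFSET 4 */
	int64_t key;		/* JUMP_KEY_OFFSET 8 */
};

/* Hypothetical mirror of an alternative entry (ALT_ENTRY_SIZE 12). */
struct demo_alt_instr {
	int32_t instr_offset;	/* ALT_ORIG_OFFSET 0 */
	int32_t replace_offset;	/* ALT_NEW_OFFSET 4 */
	uint16_t feature;	/* ALT_FEATURE_OFFSET 8 */
	uint8_t instrlen;	/* ALT_ORIG_LEN_OFFSET 10 */
	uint8_t replace_len;	/* ALT_NEW_LEN_OFFSET 11 */
};

static_assert(sizeof(struct demo_ex_entry) == 12, "EX_ENTRY_SIZE");
static_assert(offsetof(struct demo_ex_entry, fixup) == 4, "EX_NEW_OFFSET");
static_assert(sizeof(struct demo_jump_entry) == 16, "JUMP_ENTRY_SIZE");
static_assert(offsetof(struct demo_jump_entry, key) == 8, "JUMP_KEY_OFFSET");
static_assert(sizeof(struct demo_alt_instr) == 12, "ALT_ENTRY_SIZE");
static_assert(offsetof(struct demo_alt_instr, instrlen) == 10,
	      "ALT_ORIG_LEN_OFFSET");
static_assert(offsetof(struct demo_alt_instr, replace_len) == 11,
	      "ALT_NEW_LEN_OFFSET");
```

A check of this kind fails at build time if a struct and its hard-coded
offsets drift apart, which is the main risk of keeping them in two places.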

Co-developed-by: Jinyang He <[email protected]>
Signed-off-by: Jinyang He <[email protected]>
Co-developed-by: Youling Tang <[email protected]>
Signed-off-by: Youling Tang <[email protected]>
Signed-off-by: Tiezhu Yang <[email protected]>
---
tools/objtool/arch/loongarch/Build | 2 +
tools/objtool/arch/loongarch/decode.c | 71 ++++++++++++++++++++++
.../objtool/arch/loongarch/include/arch/cfi_regs.h | 21 +++++++
tools/objtool/arch/loongarch/include/arch/elf.h | 30 +++++++++
.../objtool/arch/loongarch/include/arch/special.h | 33 ++++++++++
tools/objtool/arch/loongarch/special.c | 15 +++++
6 files changed, 172 insertions(+)
create mode 100644 tools/objtool/arch/loongarch/Build
create mode 100644 tools/objtool/arch/loongarch/decode.c
create mode 100644 tools/objtool/arch/loongarch/include/arch/cfi_regs.h
create mode 100644 tools/objtool/arch/loongarch/include/arch/elf.h
create mode 100644 tools/objtool/arch/loongarch/include/arch/special.h
create mode 100644 tools/objtool/arch/loongarch/special.c

diff --git a/tools/objtool/arch/loongarch/Build b/tools/objtool/arch/loongarch/Build
new file mode 100644
index 0000000..d24d563
--- /dev/null
+++ b/tools/objtool/arch/loongarch/Build
@@ -0,0 +1,2 @@
+objtool-y += decode.o
+objtool-y += special.o
diff --git a/tools/objtool/arch/loongarch/decode.c b/tools/objtool/arch/loongarch/decode.c
new file mode 100644
index 0000000..cc74ba4
--- /dev/null
+++ b/tools/objtool/arch/loongarch/decode.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <string.h>
+#include <objtool/check.h>
+
+int arch_ftrace_match(char *name)
+{
+ return !strcmp(name, "_mcount");
+}
+
+unsigned long arch_jump_destination(struct instruction *insn)
+{
+ return insn->offset + (insn->immediate << 2);
+}
+
+unsigned long arch_dest_reloc_offset(int addend)
+{
+ return addend;
+}
+
+bool arch_pc_relative_reloc(struct reloc *reloc)
+{
+ return false;
+}
+
+bool arch_callee_saved_reg(unsigned char reg)
+{
+ switch (reg) {
+ case CFI_RA:
+ case CFI_FP:
+ case CFI_S0 ... CFI_S8:
+ return true;
+ default:
+ return false;
+ }
+}
+
+int arch_decode_hint_reg(u8 sp_reg, int *base)
+{
+ return 0;
+}
+
+int arch_decode_instruction(struct objtool_file *file, const struct section *sec,
+ unsigned long offset, unsigned int maxlen,
+ struct instruction *insn)
+{
+ return 0;
+}
+
+const char *arch_nop_insn(int len)
+{
+ return NULL;
+}
+
+const char *arch_ret_insn(int len)
+{
+ return NULL;
+}
+
+void arch_initial_func_cfi_state(struct cfi_init_state *state)
+{
+ int i;
+
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ state->regs[i].base = CFI_UNDEFINED;
+ state->regs[i].offset = 0;
+ }
+
+ /* initial CFA (call frame address) */
+ state->cfa.base = CFI_SP;
+ state->cfa.offset = 0;
+}
diff --git a/tools/objtool/arch/loongarch/include/arch/cfi_regs.h b/tools/objtool/arch/loongarch/include/arch/cfi_regs.h
new file mode 100644
index 0000000..c768d39
--- /dev/null
+++ b/tools/objtool/arch/loongarch/include/arch/cfi_regs.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _OBJTOOL_ARCH_CFI_REGS_H
+#define _OBJTOOL_ARCH_CFI_REGS_H
+
+#define CFI_RA 1
+#define CFI_SP 3
+#define CFI_FP 22
+#define CFI_S0 23
+#define CFI_S1 24
+#define CFI_S2 25
+#define CFI_S3 26
+#define CFI_S4 27
+#define CFI_S5 28
+#define CFI_S6 29
+#define CFI_S7 30
+#define CFI_S8 31
+#define CFI_NUM_REGS 32
+
+#define CFI_BP CFI_FP
+
+#endif /* _OBJTOOL_ARCH_CFI_REGS_H */
diff --git a/tools/objtool/arch/loongarch/include/arch/elf.h b/tools/objtool/arch/loongarch/include/arch/elf.h
new file mode 100644
index 0000000..9623d66
--- /dev/null
+++ b/tools/objtool/arch/loongarch/include/arch/elf.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _OBJTOOL_ARCH_ELF_H
+#define _OBJTOOL_ARCH_ELF_H
+
+/*
+ * See the following link for more info about ELF Relocation types:
+ * https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html#_relocations
+ */
+#ifndef R_LARCH_NONE
+#define R_LARCH_NONE 0
+#endif
+#ifndef R_LARCH_32
+#define R_LARCH_32 1
+#endif
+#ifndef R_LARCH_64
+#define R_LARCH_64 2
+#endif
+#ifndef R_LARCH_32_PCREL
+#define R_LARCH_32_PCREL 99
+#endif
+
+#define R_NONE R_LARCH_NONE
+#define R_ABS32 R_LARCH_32
+#define R_ABS64 R_LARCH_64
+#define R_DATA32 R_LARCH_32_PCREL
+#define R_DATA64 R_LARCH_32_PCREL
+#define R_TEXT32 R_LARCH_32_PCREL
+#define R_TEXT64 R_LARCH_32_PCREL
+
+#endif /* _OBJTOOL_ARCH_ELF_H */
diff --git a/tools/objtool/arch/loongarch/include/arch/special.h b/tools/objtool/arch/loongarch/include/arch/special.h
new file mode 100644
index 0000000..1a8245c
--- /dev/null
+++ b/tools/objtool/arch/loongarch/include/arch/special.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _OBJTOOL_ARCH_SPECIAL_H
+#define _OBJTOOL_ARCH_SPECIAL_H
+
+/*
+ * See more info about struct exception_table_entry
+ * in arch/loongarch/include/asm/extable.h
+ */
+#define EX_ENTRY_SIZE 12
+#define EX_ORIG_OFFSET 0
+#define EX_NEW_OFFSET 4
+
+/*
+ * See more info about struct jump_entry
+ * in include/linux/jump_label.h
+ */
+#define JUMP_ENTRY_SIZE 16
+#define JUMP_ORIG_OFFSET 0
+#define JUMP_NEW_OFFSET 4
+#define JUMP_KEY_OFFSET 8
+
+/*
+ * See more info about struct alt_instr
+ * in arch/loongarch/include/asm/alternative.h
+ */
+#define ALT_ENTRY_SIZE 12
+#define ALT_ORIG_OFFSET 0
+#define ALT_NEW_OFFSET 4
+#define ALT_FEATURE_OFFSET 8
+#define ALT_ORIG_LEN_OFFSET 10
+#define ALT_NEW_LEN_OFFSET 11
+
+#endif /* _OBJTOOL_ARCH_SPECIAL_H */
diff --git a/tools/objtool/arch/loongarch/special.c b/tools/objtool/arch/loongarch/special.c
new file mode 100644
index 0000000..9bba1e9
--- /dev/null
+++ b/tools/objtool/arch/loongarch/special.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <objtool/special.h>
+
+bool arch_support_alt_relocation(struct special_alt *special_alt,
+ struct instruction *insn,
+ struct reloc *reloc)
+{
+ return false;
+}
+
+struct reloc *arch_find_switch_table(struct objtool_file *file,
+ struct instruction *insn)
+{
+ return NULL;
+}
--
2.1.0

2023-10-09 13:04:14

by Tiezhu Yang

Subject: [PATCH v2 3/8] objtool/x86: Separate arch-specific and generic parts

Move init_orc_entry(), reg_name(), orc_type_name() and print_reg() from
generic orc_gen.c and orc_dump.c to arch-specific orc.c, then introduce
a new function orc_print_dump() to print info.

This is preparation for a later patch; no functional change.
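
The shape of the split can be sketched as below: the generic dump code only
calls one hook, and everything that knows register names lives on the arch
side. This is a reduced illustration under stated assumptions; struct
demo_orc_entry, the DEMO_REG_* values and both functions are hypothetical
and write into a buffer instead of calling printf().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, reduced ORC entry: a base register and an offset. */
struct demo_orc_entry {
	unsigned int sp_reg;
	int sp_offset;
};

enum { DEMO_REG_UNDEFINED = 0, DEMO_REG_SP = 1, DEMO_REG_FP = 2 };

/* Arch side: the only code that knows register names. */
static const char *demo_reg_name(unsigned int reg)
{
	switch (reg) {
	case DEMO_REG_SP:
		return "sp";
	case DEMO_REG_FP:
		return "fp";
	default:
		return "?";
	}
}

/* Arch side: formats entry i, in the role orc_print_dump() plays. */
static int demo_orc_format(char *buf, size_t len,
			   const struct demo_orc_entry *orc, int i)
{
	assert(buf && orc);

	if (orc[i].sp_reg == DEMO_REG_UNDEFINED)
		return snprintf(buf, len, "sp:(und)");

	return snprintf(buf, len, "sp:%s%+d",
			demo_reg_name(orc[i].sp_reg), orc[i].sp_offset);
}

/* Generic side: no register knowledge, it just calls the arch hook. */
static int demo_dump_entry(char *buf, size_t len,
			   const struct demo_orc_entry *orc, int i)
{
	return demo_orc_format(buf, len, orc, i);
}
```

In objtool itself the arch implementation is presumably chosen at build
time by which arch directory is compiled, not through a function pointer.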

Co-developed-by: Jinyang He <[email protected]>
Signed-off-by: Jinyang He <[email protected]>
Co-developed-by: Youling Tang <[email protected]>
Signed-off-by: Youling Tang <[email protected]>
Signed-off-by: Tiezhu Yang <[email protected]>
---
tools/objtool/arch/x86/Build | 1 +
tools/objtool/arch/x86/orc.c | 169 ++++++++++++++++++++++++++++++++++++
tools/objtool/include/objtool/orc.h | 10 +++
tools/objtool/orc_dump.c | 69 +--------------
tools/objtool/orc_gen.c | 92 +-------------------
5 files changed, 183 insertions(+), 158 deletions(-)
create mode 100644 tools/objtool/arch/x86/orc.c
create mode 100644 tools/objtool/include/objtool/orc.h

diff --git a/tools/objtool/arch/x86/Build b/tools/objtool/arch/x86/Build
index 9f7869b..3dedb2f 100644
--- a/tools/objtool/arch/x86/Build
+++ b/tools/objtool/arch/x86/Build
@@ -1,5 +1,6 @@
objtool-y += special.o
objtool-y += decode.o
+objtool-y += orc.o

inat_tables_script = ../arch/x86/tools/gen-insn-attr-x86.awk
inat_tables_maps = ../arch/x86/lib/x86-opcode-map.txt
diff --git a/tools/objtool/arch/x86/orc.c b/tools/objtool/arch/x86/orc.c
new file mode 100644
index 0000000..a4365b8
--- /dev/null
+++ b/tools/objtool/arch/x86/orc.c
@@ -0,0 +1,169 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/objtool_types.h>
+#include <asm/orc_types.h>
+
+#include <objtool/check.h>
+#include <objtool/orc.h>
+#include <objtool/warn.h>
+#include <objtool/endianness.h>
+
+int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi, struct instruction *insn)
+{
+ struct cfi_reg *bp = &cfi->regs[CFI_BP];
+
+ memset(orc, 0, sizeof(*orc));
+
+ if (!cfi) {
+ /*
+ * This is usually either unreachable nops/traps (which don't
+ * trigger unreachable instruction warnings), or
+ * STACK_FRAME_NON_STANDARD functions.
+ */
+ orc->type = ORC_TYPE_UNDEFINED;
+ return 0;
+ }
+
+ switch (cfi->type) {
+ case UNWIND_HINT_TYPE_UNDEFINED:
+ orc->type = ORC_TYPE_UNDEFINED;
+ return 0;
+ case UNWIND_HINT_TYPE_END_OF_STACK:
+ orc->type = ORC_TYPE_END_OF_STACK;
+ return 0;
+ case UNWIND_HINT_TYPE_CALL:
+ orc->type = ORC_TYPE_CALL;
+ break;
+ case UNWIND_HINT_TYPE_REGS:
+ orc->type = ORC_TYPE_REGS;
+ break;
+ case UNWIND_HINT_TYPE_REGS_PARTIAL:
+ orc->type = ORC_TYPE_REGS_PARTIAL;
+ break;
+ default:
+ WARN_INSN(insn, "unknown unwind hint type %d", cfi->type);
+ return -1;
+ }
+
+ orc->signal = cfi->signal;
+
+ switch (cfi->cfa.base) {
+ case CFI_SP:
+ orc->sp_reg = ORC_REG_SP;
+ break;
+ case CFI_SP_INDIRECT:
+ orc->sp_reg = ORC_REG_SP_INDIRECT;
+ break;
+ case CFI_BP:
+ orc->sp_reg = ORC_REG_BP;
+ break;
+ case CFI_BP_INDIRECT:
+ orc->sp_reg = ORC_REG_BP_INDIRECT;
+ break;
+ case CFI_R10:
+ orc->sp_reg = ORC_REG_R10;
+ break;
+ case CFI_R13:
+ orc->sp_reg = ORC_REG_R13;
+ break;
+ case CFI_DI:
+ orc->sp_reg = ORC_REG_DI;
+ break;
+ case CFI_DX:
+ orc->sp_reg = ORC_REG_DX;
+ break;
+ default:
+ WARN_INSN(insn, "unknown CFA base reg %d", cfi->cfa.base);
+ return -1;
+ }
+
+ switch (bp->base) {
+ case CFI_UNDEFINED:
+ orc->bp_reg = ORC_REG_UNDEFINED;
+ break;
+ case CFI_CFA:
+ orc->bp_reg = ORC_REG_PREV_SP;
+ break;
+ case CFI_BP:
+ orc->bp_reg = ORC_REG_BP;
+ break;
+ default:
+ WARN_INSN(insn, "unknown BP base reg %d", bp->base);
+ return -1;
+ }
+
+ orc->sp_offset = cfi->cfa.offset;
+ orc->bp_offset = bp->offset;
+
+ return 0;
+}
+
+static const char *reg_name(unsigned int reg)
+{
+ switch (reg) {
+ case ORC_REG_PREV_SP:
+ return "prevsp";
+ case ORC_REG_DX:
+ return "dx";
+ case ORC_REG_DI:
+ return "di";
+ case ORC_REG_BP:
+ return "bp";
+ case ORC_REG_SP:
+ return "sp";
+ case ORC_REG_R10:
+ return "r10";
+ case ORC_REG_R13:
+ return "r13";
+ case ORC_REG_BP_INDIRECT:
+ return "bp(ind)";
+ case ORC_REG_SP_INDIRECT:
+ return "sp(ind)";
+ default:
+ return "?";
+ }
+}
+
+static const char *orc_type_name(unsigned int type)
+{
+ switch (type) {
+ case ORC_TYPE_UNDEFINED:
+ return "(und)";
+ case ORC_TYPE_END_OF_STACK:
+ return "end";
+ case ORC_TYPE_CALL:
+ return "call";
+ case ORC_TYPE_REGS:
+ return "regs";
+ case ORC_TYPE_REGS_PARTIAL:
+ return "regs (partial)";
+ default:
+ return "?";
+ }
+}
+
+static void print_reg(unsigned int reg, int offset)
+{
+ if (reg == ORC_REG_BP_INDIRECT)
+ printf("(bp%+d)", offset);
+ else if (reg == ORC_REG_SP_INDIRECT)
+ printf("(sp)%+d", offset);
+ else if (reg == ORC_REG_UNDEFINED)
+ printf("(und)");
+ else
+ printf("%s%+d", reg_name(reg), offset);
+}
+
+void orc_print_dump(struct elf *dummy_elf, struct orc_entry *orc, int i)
+{
+ printf("type:%s", orc_type_name(orc[i].type));
+
+ printf(" sp:");
+
+ print_reg(orc[i].sp_reg, bswap_if_needed(dummy_elf, orc[i].sp_offset));
+
+ printf(" bp:");
+
+ print_reg(orc[i].bp_reg, bswap_if_needed(dummy_elf, orc[i].bp_offset));
+
+ printf(" signal:%d\n", orc[i].signal);
+}
diff --git a/tools/objtool/include/objtool/orc.h b/tools/objtool/include/objtool/orc.h
new file mode 100644
index 0000000..4c9f946
--- /dev/null
+++ b/tools/objtool/include/objtool/orc.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _OBJTOOL_ORC_H
+#define _OBJTOOL_ORC_H
+
+#include <objtool/check.h>
+
+int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi, struct instruction *insn);
+void orc_print_dump(struct elf *dummy_elf, struct orc_entry *orc, int i);
+
+#endif /* _OBJTOOL_ORC_H */
diff --git a/tools/objtool/orc_dump.c b/tools/objtool/orc_dump.c
index 0e183bb..a62247ef 100644
--- a/tools/objtool/orc_dump.c
+++ b/tools/objtool/orc_dump.c
@@ -6,65 +6,10 @@
#include <unistd.h>
#include <asm/orc_types.h>
#include <objtool/objtool.h>
+#include <objtool/orc.h>
#include <objtool/warn.h>
#include <objtool/endianness.h>

-static const char *reg_name(unsigned int reg)
-{
- switch (reg) {
- case ORC_REG_PREV_SP:
- return "prevsp";
- case ORC_REG_DX:
- return "dx";
- case ORC_REG_DI:
- return "di";
- case ORC_REG_BP:
- return "bp";
- case ORC_REG_SP:
- return "sp";
- case ORC_REG_R10:
- return "r10";
- case ORC_REG_R13:
- return "r13";
- case ORC_REG_BP_INDIRECT:
- return "bp(ind)";
- case ORC_REG_SP_INDIRECT:
- return "sp(ind)";
- default:
- return "?";
- }
-}
-
-static const char *orc_type_name(unsigned int type)
-{
- switch (type) {
- case ORC_TYPE_UNDEFINED:
- return "(und)";
- case ORC_TYPE_END_OF_STACK:
- return "end";
- case ORC_TYPE_CALL:
- return "call";
- case ORC_TYPE_REGS:
- return "regs";
- case ORC_TYPE_REGS_PARTIAL:
- return "regs (partial)";
- default:
- return "?";
- }
-}
-
-static void print_reg(unsigned int reg, int offset)
-{
- if (reg == ORC_REG_BP_INDIRECT)
- printf("(bp%+d)", offset);
- else if (reg == ORC_REG_SP_INDIRECT)
- printf("(sp)%+d", offset);
- else if (reg == ORC_REG_UNDEFINED)
- printf("(und)");
- else
- printf("%s%+d", reg_name(reg), offset);
-}
-
int orc_dump(const char *_objname)
{
int fd, nr_entries, i, *orc_ip = NULL, orc_size = 0;
@@ -205,17 +150,7 @@ int orc_dump(const char *_objname)
printf("%llx:", (unsigned long long)(orc_ip_addr + (i * sizeof(int)) + orc_ip[i]));
}

- printf("type:%s", orc_type_name(orc[i].type));
-
- printf(" sp:");
-
- print_reg(orc[i].sp_reg, bswap_if_needed(&dummy_elf, orc[i].sp_offset));
-
- printf(" bp:");
-
- print_reg(orc[i].bp_reg, bswap_if_needed(&dummy_elf, orc[i].bp_offset));
-
- printf(" signal:%d\n", orc[i].signal);
+ orc_print_dump(&dummy_elf, orc, i);
}

elf_end(elf);
diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c
index bae3439..1eff7e0a 100644
--- a/tools/objtool/orc_gen.c
+++ b/tools/objtool/orc_gen.c
@@ -10,100 +10,10 @@
#include <asm/orc_types.h>

#include <objtool/check.h>
+#include <objtool/orc.h>
#include <objtool/warn.h>
#include <objtool/endianness.h>

-static int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi,
- struct instruction *insn)
-{
- struct cfi_reg *bp = &cfi->regs[CFI_BP];
-
- memset(orc, 0, sizeof(*orc));
-
- if (!cfi) {
- /*
- * This is usually either unreachable nops/traps (which don't
- * trigger unreachable instruction warnings), or
- * STACK_FRAME_NON_STANDARD functions.
- */
- orc->type = ORC_TYPE_UNDEFINED;
- return 0;
- }
-
- switch (cfi->type) {
- case UNWIND_HINT_TYPE_UNDEFINED:
- orc->type = ORC_TYPE_UNDEFINED;
- return 0;
- case UNWIND_HINT_TYPE_END_OF_STACK:
- orc->type = ORC_TYPE_END_OF_STACK;
- return 0;
- case UNWIND_HINT_TYPE_CALL:
- orc->type = ORC_TYPE_CALL;
- break;
- case UNWIND_HINT_TYPE_REGS:
- orc->type = ORC_TYPE_REGS;
- break;
- case UNWIND_HINT_TYPE_REGS_PARTIAL:
- orc->type = ORC_TYPE_REGS_PARTIAL;
- break;
- default:
- WARN_INSN(insn, "unknown unwind hint type %d", cfi->type);
- return -1;
- }
-
- orc->signal = cfi->signal;
-
- switch (cfi->cfa.base) {
- case CFI_SP:
- orc->sp_reg = ORC_REG_SP;
- break;
- case CFI_SP_INDIRECT:
- orc->sp_reg = ORC_REG_SP_INDIRECT;
- break;
- case CFI_BP:
- orc->sp_reg = ORC_REG_BP;
- break;
- case CFI_BP_INDIRECT:
- orc->sp_reg = ORC_REG_BP_INDIRECT;
- break;
- case CFI_R10:
- orc->sp_reg = ORC_REG_R10;
- break;
- case CFI_R13:
- orc->sp_reg = ORC_REG_R13;
- break;
- case CFI_DI:
- orc->sp_reg = ORC_REG_DI;
- break;
- case CFI_DX:
- orc->sp_reg = ORC_REG_DX;
- break;
- default:
- WARN_INSN(insn, "unknown CFA base reg %d", cfi->cfa.base);
- return -1;
- }
-
- switch (bp->base) {
- case CFI_UNDEFINED:
- orc->bp_reg = ORC_REG_UNDEFINED;
- break;
- case CFI_CFA:
- orc->bp_reg = ORC_REG_PREV_SP;
- break;
- case CFI_BP:
- orc->bp_reg = ORC_REG_BP;
- break;
- default:
- WARN_INSN(insn, "unknown BP base reg %d", bp->base);
- return -1;
- }
-
- orc->sp_offset = cfi->cfa.offset;
- orc->bp_offset = bp->offset;
-
- return 0;
-}
-
static int write_orc_entry(struct elf *elf, struct section *orc_sec,
struct section *ip_sec, unsigned int idx,
struct section *insn_sec, unsigned long insn_off,
--
2.1.0

2023-10-09 13:04:15

by Tiezhu Yang

Subject: [PATCH v2 6/8] objtool: Check local label in add_dead_ends()

After updating to the latest upstream gcc and binutils, which enable
linker relaxation by default, objtool generates more warnings on
LoongArch, like this:

init/main.o: warning: objtool: unexpected relocation symbol type in .rela.discard.unreachable

In the relocation section '.rela.discard.unreachable', the reloc symbol
is a local label instead of a section, so its type is STT_NOTYPE instead
of STT_SECTION. Check for this case so that add_dead_ends() does not
return -1, and use reloc->sym->offset instead of the reloc addend (which
is 0) to find the corresponding instruction. At the same time, rename
the variable "addend" to "offset" to reflect what it now holds.
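
Once the offset is known, add_dead_ends() marks the instruction before the
annotation, or the last instruction when the annotation sits at the end of
the section. The sketch below models only that arithmetic. It assumes a
section made entirely of 4-byte instructions, and demo_dead_end_insn() is a
hypothetical helper, not an objtool function.

```c
#include <assert.h>

#define DEMO_INSN_SIZE	4UL	/* assumed fixed instruction width */

/*
 * Returns the offset of the instruction treated as the dead end for an
 * annotation at 'offset' in a section of 'sec_size' bytes, or -1 if no
 * such instruction exists in this simplified model.
 */
static long demo_dead_end_insn(unsigned long sec_size, unsigned long offset)
{
	assert(sec_size % DEMO_INSN_SIZE == 0);

	if (offset % DEMO_INSN_SIZE != 0)
		return -1;

	if (offset < sec_size) {
		/* An insn exists at offset: the dead end is the one before. */
		if (offset < DEMO_INSN_SIZE)
			return -1;
		return (long)(offset - DEMO_INSN_SIZE);
	}

	if (offset == sec_size && sec_size >= DEMO_INSN_SIZE) {
		/* Annotation at the section end: use the last instruction. */
		return (long)(sec_size - DEMO_INSN_SIZE);
	}

	return -1;
}
```

For a label at 0x228, as in the readelf output in this message, the model
picks the instruction at 0x224.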

Here is some detailed info:
[fedora@linux 6.6.test]$ gcc --version
gcc (GCC) 14.0.0 20231009 (experimental)
[fedora@linux 6.6.test]$ as --version
GNU assembler (GNU Binutils) 2.41.50.20231009
[fedora@linux 6.6.test]$ readelf -r init/main.o | grep -A 2 "rela.discard.unreachable"
Relocation section '.rela.discard.unreachable' at offset 0x4b70 contains 1 entry:
Offset Info Type Sym. Value Sym. Name + Addend
000000000000 00de00000063 R_LARCH_32_PCREL 0000000000000228 .L466^B1 + 0

Signed-off-by: Tiezhu Yang <[email protected]>
---
tools/objtool/check.c | 36 +++++++++++++++++++++---------------
1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index a9cb224..eee5621 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -611,7 +611,7 @@ static int add_dead_ends(struct objtool_file *file)
struct section *rsec;
struct reloc *reloc;
struct instruction *insn;
- s64 addend;
+ unsigned long offset;

/*
* Check for manually annotated dead ends.
@@ -622,26 +622,29 @@ static int add_dead_ends(struct objtool_file *file)

for_each_reloc(rsec, reloc) {

- if (reloc->sym->type != STT_SECTION) {
+ if (reloc->sym->type == STT_SECTION) {
+ offset = reloc_addend(reloc);
+ } else if ((reloc->sym->type == STT_NOTYPE) &&
+ strncmp(reloc->sym->name, ".L", 2) == 0) {
+ offset = reloc->sym->offset;
+ } else {
WARN("unexpected relocation symbol type in %s", rsec->name);
return -1;
}

- addend = reloc_addend(reloc);
-
- insn = find_insn(file, reloc->sym->sec, addend);
+ insn = find_insn(file, reloc->sym->sec, offset);
if (insn)
insn = prev_insn_same_sec(file, insn);
- else if (addend == reloc->sym->sec->sh.sh_size) {
+ else if (offset == reloc->sym->sec->sh.sh_size) {
insn = find_last_insn(file, reloc->sym->sec);
if (!insn) {
WARN("can't find unreachable insn at %s+0x%" PRIx64,
- reloc->sym->sec->name, addend);
+ reloc->sym->sec->name, offset);
return -1;
}
} else {
WARN("can't find unreachable insn at %s+0x%" PRIx64,
- reloc->sym->sec->name, addend);
+ reloc->sym->sec->name, offset);
return -1;
}

@@ -661,26 +664,29 @@ static int add_dead_ends(struct objtool_file *file)

for_each_reloc(rsec, reloc) {

- if (reloc->sym->type != STT_SECTION) {
+ if (reloc->sym->type == STT_SECTION) {
+ offset = reloc_addend(reloc);
+ } else if ((reloc->sym->type == STT_NOTYPE) &&
+ strncmp(reloc->sym->name, ".L", 2) == 0) {
+ offset = reloc->sym->offset;
+ } else {
WARN("unexpected relocation symbol type in %s", rsec->name);
return -1;
}

- addend = reloc_addend(reloc);
-
- insn = find_insn(file, reloc->sym->sec, addend);
+ insn = find_insn(file, reloc->sym->sec, offset);
if (insn)
insn = prev_insn_same_sec(file, insn);
- else if (addend == reloc->sym->sec->sh.sh_size) {
+ else if (offset == reloc->sym->sec->sh.sh_size) {
insn = find_last_insn(file, reloc->sym->sec);
if (!insn) {
WARN("can't find reachable insn at %s+0x%" PRIx64,
- reloc->sym->sec->name, addend);
+ reloc->sym->sec->name, offset);
return -1;
}
} else {
WARN("can't find reachable insn at %s+0x%" PRIx64,
- reloc->sym->sec->name, addend);
+ reloc->sym->sec->name, offset);
return -1;
}

--
2.1.0

2023-10-09 13:04:19

by Tiezhu Yang

Subject: [PATCH v2 5/8] objtool: Check local label about sibling call

After updating to the latest upstream gcc and binutils, which enable
linker relaxation by default, objtool generates more warnings on
LoongArch, like this:

init/version.o: warning: objtool: early_hostname+0x20: sibling call from callable instruction with modified stack frame

The branch to the local label ".L2" is not a sibling call, because a
sibling call is a tail call to another symbol. In this case, make
is_sibling_call() return false, and set dest_sec and dest_off so that
add_jump_destinations() can calculate jump_dest.

Here is some detailed info:
[fedora@linux 6.6.test]$ gcc --version
gcc (GCC) 14.0.0 20231009 (experimental)
[fedora@linux 6.6.test]$ as --version
GNU assembler (GNU Binutils) 2.41.50.20231009
[fedora@linux 6.6.test]$ objdump -M no-aliases -D init/version.o | grep -A 21 "init.text"
Disassembly of section .init.text:

0000000000000000 <early_hostname>:
0: 1a00000c pcalau12i $t0, 0
4: 02ffc063 addi.d $sp, $sp, -16
8: 00150085 or $a1, $a0, $zero
c: 02810406 addi.w $a2, $zero, 65
10: 02c00184 addi.d $a0, $t0, 0
14: 29c02061 st.d $ra, $sp, 8
18: 54000000 bl 0 # 18 <early_hostname+0x18>
1c: 0281000c addi.w $t0, $zero, 64
20: 6c001584 bgeu $t0, $a0, 20 # 34 <.L2>
24: 1a000004 pcalau12i $a0, 0
28: 02810005 addi.w $a1, $zero, 64
2c: 02c00084 addi.d $a0, $a0, 0
30: 54000000 bl 0 # 30 <early_hostname+0x30>

0000000000000034 <.L2>:
34: 28c02061 ld.d $ra, $sp, 8
38: 00150004 or $a0, $zero, $zero
3c: 02c04063 addi.d $sp, $sp, 16
40: 4c000020 jirl $zero, $ra, 0
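
The destination that add_jump_destinations() needs for the bgeu at 0x20 can
be worked out by hand from the listing above. The sketch below does so
under two assumptions: that the 16-bit branch offset sits in bits [25:10]
of the instruction word, and that it counts 4-byte instructions, as
arch_jump_destination() in patch 1 implies. The demo_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract and sign-extend the assumed offs16 field, bits [25:10]. */
static int32_t demo_offs16(uint32_t insn_word)
{
	uint32_t imm = (insn_word >> 10) & 0xffff;

	return (int32_t)(int16_t)imm;
}

/*
 * Same arithmetic as insn->offset + (insn->immediate << 2), written with
 * a multiplication so that negative immediates are well defined in C.
 */
static unsigned long demo_jump_destination(unsigned long insn_offset,
					   int32_t immediate)
{
	assert((insn_offset & 3) == 0);	/* instructions are 4-byte aligned */

	return insn_offset + (unsigned long)((long)immediate * 4);
}
```

For the word 6c001584 at offset 0x20, the field decodes to 5, so the
destination is 0x20 + 5 * 4 = 0x34, which is where .L2 sits in the
listing.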

Also, insn_reloc() needs to be moved before is_sibling_call()
to avoid an implicit declaration build error.

Signed-off-by: Tiezhu Yang <[email protected]>
Reviewed-by: Huacai Chen <[email protected]>
---
tools/objtool/check.c | 69 ++++++++++++++++++++++++++++++---------------------
1 file changed, 41 insertions(+), 28 deletions(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e308d1b..a9cb224 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -161,12 +161,39 @@ static bool is_jump_table_jump(struct instruction *insn)
insn_jump_table(alt_group->orig_group->first_insn);
}

-static bool is_sibling_call(struct instruction *insn)
+static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
+{
+ struct reloc *reloc;
+
+ if (insn->no_reloc)
+ return NULL;
+
+ if (!file)
+ return NULL;
+
+ reloc = find_reloc_by_dest_range(file->elf, insn->sec,
+ insn->offset, insn->len);
+ if (!reloc) {
+ insn->no_reloc = 1;
+ return NULL;
+ }
+
+ return reloc;
+}
+
+static bool is_sibling_call(struct objtool_file *file, struct instruction *insn)
{
/*
* Assume only STT_FUNC calls have jump-tables.
*/
if (insn_func(insn)) {
+ struct reloc *reloc = insn_reloc(file, insn);
+
+ /* Disallow sibling calls into STT_NOTYPE if it is a local label */
+ if (reloc && reloc->sym->type == STT_NOTYPE &&
+ strncmp(reloc->sym->name, ".L", 2) == 0)
+ return false;
+
/* An indirect jump is either a sibling call or a jump to a table. */
if (insn->type == INSN_JUMP_DYNAMIC)
return !is_jump_table_jump(insn);
@@ -232,7 +259,7 @@ static bool __dead_end_function(struct objtool_file *file, struct symbol *func,
* of the sibling call returns.
*/
func_for_each_insn(file, func, insn) {
- if (is_sibling_call(insn)) {
+ if (is_sibling_call(file, insn)) {
struct instruction *dest = insn->jump_dest;

if (!dest)
@@ -743,7 +770,7 @@ static int create_static_call_sections(struct objtool_file *file)
if (!elf_init_reloc_data_sym(file->elf, sec,
idx * sizeof(*site) + 4,
(idx * 2) + 1, key_sym,
- is_sibling_call(insn) * STATIC_CALL_SITE_TAIL))
+ is_sibling_call(file, insn) * STATIC_CALL_SITE_TAIL))
return -1;

idx++;
@@ -1315,26 +1342,6 @@ __weak bool arch_is_embedded_insn(struct symbol *sym)
return false;
}

-static struct reloc *insn_reloc(struct objtool_file *file, struct instruction *insn)
-{
- struct reloc *reloc;
-
- if (insn->no_reloc)
- return NULL;
-
- if (!file)
- return NULL;
-
- reloc = find_reloc_by_dest_range(file->elf, insn->sec,
- insn->offset, insn->len);
- if (!reloc) {
- insn->no_reloc = 1;
- return NULL;
- }
-
- return reloc;
-}
-
static void remove_insn_ops(struct instruction *insn)
{
struct stack_op *op, *next;
@@ -1577,8 +1584,14 @@ static int add_jump_destinations(struct objtool_file *file)
* External sibling call or internal sibling call with
* STT_FUNC reloc.
*/
- add_call_dest(file, insn, reloc->sym, true);
- continue;
+ if (reloc->sym->type == STT_NOTYPE &&
+ strncmp(reloc->sym->name, ".L", 2) == 0) {
+ dest_sec = insn->sec;
+ dest_off = arch_jump_destination(insn);
+ } else {
+ add_call_dest(file, insn, reloc->sym, true);
+ continue;
+ }
} else if (reloc->sym->sec->idx) {
dest_sec = reloc->sym->sec;
dest_off = reloc->sym->sym.st_value +
@@ -3674,7 +3687,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,

case INSN_JUMP_CONDITIONAL:
case INSN_JUMP_UNCONDITIONAL:
- if (is_sibling_call(insn)) {
+ if (is_sibling_call(file, insn)) {
ret = validate_sibling_call(file, insn, &state);
if (ret)
return ret;
@@ -3695,7 +3708,7 @@ static int validate_branch(struct objtool_file *file, struct symbol *func,

case INSN_JUMP_DYNAMIC:
case INSN_JUMP_DYNAMIC_CONDITIONAL:
- if (is_sibling_call(insn)) {
+ if (is_sibling_call(file, insn)) {
ret = validate_sibling_call(file, insn, &state);
if (ret)
return ret;
@@ -3859,7 +3872,7 @@ static int validate_unret(struct objtool_file *file, struct instruction *insn)

case INSN_JUMP_UNCONDITIONAL:
case INSN_JUMP_CONDITIONAL:
- if (!is_sibling_call(insn)) {
+ if (!is_sibling_call(file, insn)) {
if (!insn->jump_dest) {
WARN_INSN(insn, "unresolved jump target after linking?!?");
return -1;
--
2.1.0

2023-10-09 13:04:24

by Tiezhu Yang

Subject: [PATCH v2 4/8] objtool/LoongArch: Enable orc to be built

Implement the arch-specific init_orc_entry(), reg_name(), orc_type_name(),
print_reg() and orc_print_dump(), then set BUILD_ORC to y so that the
ORC-related files are built.
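
For readers who want to see what a dumped register field looks like, here
is a minimal user-space sketch of the formatting done by reg_name() and
print_reg() in this patch. The sketch_* names and the REG_* enum are
illustrative only; the enum values mirror the ORC_REG_* defines from
orc_types.h, and the function writes to a buffer rather than stdout so it
is easy to check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values mirror the ORC_REG_* defines in orc_types.h (names are illustrative). */
enum { REG_UNDEFINED = 0, REG_PREV_SP = 1, REG_SP = 2, REG_BP = 3 };

/* Stand-in for reg_name() in orc.c: map a base register to its short name. */
static const char *sketch_reg_name(unsigned int reg)
{
	switch (reg) {
	case REG_SP:
		return "sp";
	case REG_BP:
		return "fp";
	case REG_PREV_SP:
		return "prevsp";
	default:
		return "?";
	}
}

/* Stand-in for print_reg(): format "<base> + <offset>" or the undefined marker. */
static int sketch_format_reg(char *buf, size_t len, unsigned int reg, int offset)
{
	assert(buf && len > 0);

	if (reg == REG_UNDEFINED)
		return snprintf(buf, len, " (und) ");

	return snprintf(buf, len, "%s + %3d", sketch_reg_name(reg), offset);
}
```

Calling sketch_format_reg() once each for the sp, bp and ra fields of an
entry gives the three columns that orc_print_dump() prints after "type:".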

Co-developed-by: Jinyang He <[email protected]>
Signed-off-by: Jinyang He <[email protected]>
Co-developed-by: Youling Tang <[email protected]>
Signed-off-by: Youling Tang <[email protected]>
Signed-off-by: Tiezhu Yang <[email protected]>
---
tools/arch/loongarch/include/asm/orc_types.h | 58 ++++++++++
tools/objtool/Makefile | 4 +
tools/objtool/arch/loongarch/Build | 1 +
tools/objtool/arch/loongarch/decode.c | 16 +++
tools/objtool/arch/loongarch/orc.c | 155 +++++++++++++++++++++++++++
5 files changed, 234 insertions(+)
create mode 100644 tools/arch/loongarch/include/asm/orc_types.h
create mode 100644 tools/objtool/arch/loongarch/orc.c

diff --git a/tools/arch/loongarch/include/asm/orc_types.h b/tools/arch/loongarch/include/asm/orc_types.h
new file mode 100644
index 0000000..1d37e62
--- /dev/null
+++ b/tools/arch/loongarch/include/asm/orc_types.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ORC_TYPES_H
+#define _ORC_TYPES_H
+
+#include <linux/types.h>
+
+/*
+ * The ORC_REG_* registers are base registers which are used to find other
+ * registers on the stack.
+ *
+ * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the
+ * address of the previous frame: the caller's SP before it called the current
+ * function.
+ *
+ * ORC_REG_UNDEFINED means the corresponding register's value didn't change in
+ * the current frame.
+ *
+ * The most commonly used base registers are SP and BP -- which the previous SP
+ * is usually based on -- and PREV_SP and UNDEFINED -- which the previous BP is
+ * usually based on.
+ *
+ * The rest of the base registers are needed for special cases like entry code
+ * and GCC realigned stacks.
+ */
+#define ORC_REG_UNDEFINED 0
+#define ORC_REG_PREV_SP 1
+#define ORC_REG_SP 2
+#define ORC_REG_BP 3
+#define ORC_REG_MAX 4
+
+#define ORC_TYPE_UNDEFINED 0
+#define ORC_TYPE_END_OF_STACK 1
+#define ORC_TYPE_CALL 2
+#define ORC_TYPE_REGS 3
+#define ORC_TYPE_REGS_PARTIAL 4
+
+#ifndef __ASSEMBLY__
+/*
+ * This struct is more or less a vastly simplified version of the DWARF Call
+ * Frame Information standard. It contains only the necessary parts of DWARF
+ * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
+ * unwinder how to find the previous SP and BP (and sometimes entry regs) on
+ * the stack for a given code address. Each instance of the struct corresponds
+ * to one or more code locations.
+ */
+struct orc_entry {
+ s16 sp_offset;
+ s16 bp_offset;
+ s16 ra_offset;
+ unsigned int sp_reg:4;
+ unsigned int bp_reg:4;
+ unsigned int ra_reg:4;
+ unsigned int type:3;
+ unsigned int signal:1;
+};
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ORC_TYPES_H */
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 83b100c..bf7f7f8 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -57,6 +57,10 @@ ifeq ($(SRCARCH),x86)
BUILD_ORC := y
endif

+ifeq ($(SRCARCH),loongarch)
+ BUILD_ORC := y
+endif
+
export BUILD_ORC
export srctree OUTPUT CFLAGS SRCARCH AWK
include $(srctree)/tools/build/Makefile.include
diff --git a/tools/objtool/arch/loongarch/Build b/tools/objtool/arch/loongarch/Build
index d24d563..1d4b784 100644
--- a/tools/objtool/arch/loongarch/Build
+++ b/tools/objtool/arch/loongarch/Build
@@ -1,2 +1,3 @@
objtool-y += decode.o
objtool-y += special.o
+objtool-y += orc.o
diff --git a/tools/objtool/arch/loongarch/decode.c b/tools/objtool/arch/loongarch/decode.c
index 3a426e4..1c96759 100644
--- a/tools/objtool/arch/loongarch/decode.c
+++ b/tools/objtool/arch/loongarch/decode.c
@@ -3,6 +3,8 @@
#include <objtool/check.h>
#include <objtool/warn.h>
#include <asm/inst.h>
+#include <asm/orc_types.h>
+#include <linux/objtool_types.h>

int arch_ftrace_match(char *name)
{
@@ -38,6 +40,20 @@ bool arch_callee_saved_reg(unsigned char reg)

int arch_decode_hint_reg(u8 sp_reg, int *base)
{
+ switch (sp_reg) {
+ case ORC_REG_UNDEFINED:
+ *base = CFI_UNDEFINED;
+ break;
+ case ORC_REG_SP:
+ *base = CFI_SP;
+ break;
+ case ORC_REG_BP:
+ *base = CFI_FP;
+ break;
+ default:
+ return -1;
+ }
+
return 0;
}

diff --git a/tools/objtool/arch/loongarch/orc.c b/tools/objtool/arch/loongarch/orc.c
new file mode 100644
index 0000000..7d7ecee
--- /dev/null
+++ b/tools/objtool/arch/loongarch/orc.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <linux/objtool_types.h>
+#include <asm/orc_types.h>
+
+#include <objtool/check.h>
+#include <objtool/orc.h>
+#include <objtool/warn.h>
+#include <objtool/endianness.h>
+
+int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi, struct instruction *insn)
+{
+ struct cfi_reg *bp = &cfi->regs[CFI_BP];
+ struct cfi_reg *ra = &cfi->regs[CFI_RA];
+
+ memset(orc, 0, sizeof(*orc));
+
+ if (!cfi) {
+ /*
+ * This is usually either unreachable nops/traps (which don't
+ * trigger unreachable instruction warnings), or
+ * STACK_FRAME_NON_STANDARD functions.
+ */
+ orc->type = ORC_TYPE_UNDEFINED;
+ return 0;
+ }
+
+ switch (cfi->type) {
+ case UNWIND_HINT_TYPE_UNDEFINED:
+ orc->type = ORC_TYPE_UNDEFINED;
+ return 0;
+ case UNWIND_HINT_TYPE_END_OF_STACK:
+ orc->type = ORC_TYPE_END_OF_STACK;
+ return 0;
+ case UNWIND_HINT_TYPE_CALL:
+ orc->type = ORC_TYPE_CALL;
+ break;
+ case UNWIND_HINT_TYPE_REGS:
+ orc->type = ORC_TYPE_REGS;
+ break;
+ case UNWIND_HINT_TYPE_REGS_PARTIAL:
+ orc->type = ORC_TYPE_REGS_PARTIAL;
+ break;
+ default:
+ WARN_INSN(insn, "unknown unwind hint type %d", cfi->type);
+ return -1;
+ }
+
+ orc->signal = cfi->signal;
+
+ switch (cfi->cfa.base) {
+ case CFI_SP:
+ orc->sp_reg = ORC_REG_SP;
+ break;
+ case CFI_BP:
+ orc->sp_reg = ORC_REG_BP;
+ break;
+ default:
+ WARN_INSN(insn, "unknown CFA base reg %d", cfi->cfa.base);
+ return -1;
+ }
+
+ switch (bp->base) {
+ case CFI_UNDEFINED:
+ orc->bp_reg = ORC_REG_UNDEFINED;
+ orc->bp_offset = 0;
+ break;
+ case CFI_CFA:
+ orc->bp_reg = ORC_REG_PREV_SP;
+ orc->bp_offset = bp->offset;
+ break;
+ case CFI_BP:
+ orc->bp_reg = ORC_REG_BP;
+ break;
+ default:
+ WARN_INSN(insn, "unknown BP base reg %d", bp->base);
+ return -1;
+ }
+
+ switch (ra->base) {
+ case CFI_UNDEFINED:
+ orc->ra_reg = ORC_REG_UNDEFINED;
+ orc->ra_offset = 0;
+ break;
+ case CFI_CFA:
+ orc->ra_reg = ORC_REG_PREV_SP;
+ orc->ra_offset = ra->offset;
+ break;
+ case CFI_BP:
+ orc->ra_reg = ORC_REG_BP;
+ break;
+ default:
+ WARN_INSN(insn, "unknown RA base reg %d", ra->base);
+ return -1;
+ }
+
+ orc->sp_offset = cfi->cfa.offset;
+
+ return 0;
+}
+
+static const char *reg_name(unsigned int reg)
+{
+ switch (reg) {
+ case ORC_REG_SP:
+ return "sp";
+ case ORC_REG_BP:
+ return "fp";
+ case ORC_REG_PREV_SP:
+ return "prevsp";
+ default:
+ return "?";
+ }
+}
+
+static const char *orc_type_name(unsigned int type)
+{
+ switch (type) {
+ case UNWIND_HINT_TYPE_CALL:
+ return "call";
+ case UNWIND_HINT_TYPE_REGS:
+ return "regs";
+ case UNWIND_HINT_TYPE_REGS_PARTIAL:
+ return "regs (partial)";
+ default:
+ return "?";
+ }
+}
+
+static void print_reg(unsigned int reg, int offset)
+{
+ if (reg == ORC_REG_UNDEFINED)
+ printf(" (und) ");
+ else
+ printf("%s + %3d", reg_name(reg), offset);
+
+}
+
+void orc_print_dump(struct elf *dummy_elf, struct orc_entry *orc, int i)
+{
+ printf("type:%s", orc_type_name(orc[i].type));
+
+ printf(" sp:");
+
+ print_reg(orc[i].sp_reg, bswap_if_needed(dummy_elf, orc[i].sp_offset));
+
+ printf(" bp:");
+
+ print_reg(orc[i].bp_reg, bswap_if_needed(dummy_elf, orc[i].bp_offset));
+
+ printf(" ra:");
+
+ print_reg(orc[i].ra_reg, bswap_if_needed(dummy_elf, orc[i].ra_offset));
+
+ printf(" signal:%d\n", orc[i].signal);
+}
--
2.1.0

2023-10-09 13:04:45

by Tiezhu Yang

Subject: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
similar in concept to a DWARF unwinder. The difference is that the format
of the ORC data is much simpler than DWARF, which in turn allows the ORC
unwinder to be much simpler and faster.

The ORC data consists of unwind tables which are generated by objtool.
They contain out-of-band data which is used by the in-kernel ORC unwinder.
Objtool generates the ORC data by first doing compile-time stack metadata
validation (CONFIG_STACK_VALIDATION). After analyzing all the code paths
of a .o file, it determines information about the stack state at each
instruction address in the file and outputs that information to the
.orc_unwind and .orc_unwind_ip sections.

The per-object ORC sections are combined at link time and are sorted and
post-processed at boot time. The unwinder uses the resulting data to
correlate instruction addresses with their stack states at run time.
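
As a rough illustration of that run-time correlation, the sketch below
models the lookup in user space: each slot of the ip table stores a 32-bit
offset relative to the slot's own address (as orc_ip() does), and a binary
search picks the rightmost entry whose start address is <= the queried
address (as __orc_find() does). The sketch_* names are hypothetical, and
returning an index instead of an orc_entry pointer is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store 'target' in a slot as an offset relative to the slot's own address. */
static void sketch_orc_encode(int32_t *slot, uintptr_t target)
{
	intptr_t delta = (intptr_t)(target - (uintptr_t)slot);

	/* The table format only has room for a 32-bit signed offset. */
	assert(delta >= INT32_MIN && delta <= INT32_MAX);
	*slot = (int32_t)delta;
}

/* Decode a slot: slot address plus the stored signed offset. */
static uintptr_t sketch_orc_ip(const int32_t *slot)
{
	return (uintptr_t)slot + (uintptr_t)(intptr_t)*slot;
}

/*
 * Return the index of the rightmost entry whose start address is <= ip.
 * Returns -1 for an empty table; falls back to index 0 if ip precedes
 * every entry, matching the default of 'found' in __orc_find().
 */
static long sketch_orc_find(const int32_t *ip_table, size_t num_entries,
			    uintptr_t ip)
{
	long first = 0, last = (long)num_entries - 1, found = 0;

	if (!num_entries)
		return -1;

	while (first <= last) {
		long mid = first + (last - first) / 2;

		if (sketch_orc_ip(&ip_table[mid]) <= ip) {
			found = mid;
			first = mid + 1;
		} else {
			last = mid - 1;
		}
	}

	return found;
}
```

The returned index would then select the matching entry in the parallel
.orc_unwind table. Storing relative offsets keeps each slot at 4 bytes.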

Most of the logic is similar to x86. To obtain the return address (ra)
before it has been saved onto the stack, add ra_reg and ra_offset to
orc_entry. Also modify some arch-specific code to silence the objtool
warnings.
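
To make the role of ra_reg and ra_offset concrete, here is a simplified
user-space model of one unwind step. It is a sketch under assumptions, not
the code in unwind_orc.c: the sketch_* types are hypothetical, only the SP
and BP bases are modelled for the previous SP, and an undefined ra_reg is
assumed to mean that ra has not been spilled yet and still lives in the
register.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror the ORC_REG_* defines in orc_types.h (names are illustrative). */
enum { REG_UNDEFINED = 0, REG_PREV_SP = 1, REG_SP = 2, REG_BP = 3 };

/* Reduced stand-in for struct orc_entry: only the fields this sketch uses. */
struct sketch_orc {
	int16_t sp_offset;
	int16_t ra_offset;
	unsigned int sp_reg;
	unsigned int ra_reg;
};

/* Reduced stand-in for struct unwind_state. */
struct sketch_state {
	uintptr_t sp, fp, ra;
};

/*
 * Compute the caller's SP and the return address for one frame.
 * Returns 0 on success, -1 if the entry uses a base this sketch
 * does not model.
 */
static int sketch_step(const struct sketch_orc *orc, struct sketch_state *st,
		       uintptr_t *ret_addr)
{
	uintptr_t prev_sp;

	switch (orc->sp_reg) {
	case REG_SP:
		prev_sp = st->sp + orc->sp_offset;
		break;
	case REG_BP:
		prev_sp = st->fp + orc->sp_offset;
		break;
	default:
		return -1;
	}

	switch (orc->ra_reg) {
	case REG_PREV_SP:
		/* ra was spilled: load it from the frame, relative to the CFA. */
		*ret_addr = *(const uintptr_t *)(prev_sp + orc->ra_offset);
		break;
	case REG_UNDEFINED:
		/* Assumption: ra not saved yet, so use the live register value. */
		*ret_addr = st->ra;
		break;
	default:
		return -1;
	}

	st->sp = prev_sp;
	return 0;
}
```

With an entry covering the function prologue (ra_reg undefined) the step
returns the register value; with an entry covering the body (ra_reg set to
PREV_SP and a negative ra_offset) it reads the saved slot instead.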

Co-developed-by: Jinyang He <[email protected]>
Signed-off-by: Jinyang He <[email protected]>
Co-developed-by: Youling Tang <[email protected]>
Signed-off-by: Youling Tang <[email protected]>
Signed-off-by: Tiezhu Yang <[email protected]>
---
arch/loongarch/Kconfig | 2 +
arch/loongarch/Kconfig.debug | 11 +
arch/loongarch/Makefile | 23 ++
arch/loongarch/configs/loongson3_defconfig | 1 +
arch/loongarch/include/asm/Kbuild | 1 +
arch/loongarch/include/asm/bug.h | 1 +
arch/loongarch/include/asm/linkage.h | 2 +
arch/loongarch/include/asm/module.h | 7 +
arch/loongarch/include/asm/orc_header.h | 19 +
arch/loongarch/include/asm/orc_lookup.h | 34 ++
arch/loongarch/include/asm/orc_types.h | 58 +++
arch/loongarch/include/asm/stackframe.h | 3 +
arch/loongarch/include/asm/unwind.h | 22 +-
arch/loongarch/include/asm/unwind_hints.h | 28 ++
arch/loongarch/kernel/Makefile | 3 +
arch/loongarch/kernel/entry.S | 9 +-
arch/loongarch/kernel/genex.S | 20 +-
arch/loongarch/kernel/head.S | 1 +
arch/loongarch/kernel/module.c | 11 +-
arch/loongarch/kernel/relocate_kernel.S | 2 +
arch/loongarch/kernel/setup.c | 2 +
arch/loongarch/kernel/stacktrace.c | 1 +
arch/loongarch/kernel/unwind_orc.c | 571 +++++++++++++++++++++++++++++
arch/loongarch/kernel/vmlinux.lds.S | 3 +
arch/loongarch/lib/Makefile | 2 +
arch/loongarch/mm/tlbex.S | 45 ++-
arch/loongarch/power/Makefile | 2 +
arch/loongarch/vdso/Makefile | 1 +
include/linux/compiler.h | 9 +
scripts/Makefile | 5 +-
30 files changed, 867 insertions(+), 32 deletions(-)
create mode 100644 arch/loongarch/include/asm/orc_header.h
create mode 100644 arch/loongarch/include/asm/orc_lookup.h
create mode 100644 arch/loongarch/include/asm/orc_types.h
create mode 100644 arch/loongarch/include/asm/unwind_hints.h
create mode 100644 arch/loongarch/kernel/unwind_orc.c

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index e14396a..21ef3bb 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -131,6 +131,7 @@ config LOONGARCH
select HAVE_KRETPROBES
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI
+ select HAVE_OBJTOOL if AS_HAS_EXPLICIT_RELOCS
select HAVE_PCI
select HAVE_PERF_EVENTS
select HAVE_PERF_REGS
@@ -141,6 +142,7 @@ config LOONGARCH
select HAVE_SAMPLE_FTRACE_DIRECT
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
select HAVE_SETUP_PER_CPU_AREA if NUMA
+ select HAVE_STACK_VALIDATION if HAVE_OBJTOOL
select HAVE_STACKPROTECTOR
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_TIF_NOHZ
diff --git a/arch/loongarch/Kconfig.debug b/arch/loongarch/Kconfig.debug
index 8d36aab..98d6063 100644
--- a/arch/loongarch/Kconfig.debug
+++ b/arch/loongarch/Kconfig.debug
@@ -26,4 +26,15 @@ config UNWINDER_PROLOGUE
Some of the addresses it reports may be incorrect (but better than the
Guess unwinder).

+config UNWINDER_ORC
+ bool "ORC unwinder"
+ select OBJTOOL
+ help
+ This option enables the ORC (Oops Rewind Capability) unwinder for
+ unwinding kernel stack traces. It uses a custom data format which is
+ a simplified version of the DWARF Call Frame Information standard.
+
+ Enabling this option will increase the kernel's runtime memory usage
+ by roughly 2-4MB, depending on your kernel config.
+
endchoice
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index fb0fada..89a6e61 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -25,6 +25,29 @@ endif
32bit-emul = elf32loongarch
64bit-emul = elf64loongarch

+ifdef CONFIG_OBJTOOL
+# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
+# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
+ifeq ($(shell as --help 2>&1 | grep -e '-mthin-add-sub'),)
+ $(error Sorry, you need a newer gas version with -mthin-add-sub option)
+endif
+KBUILD_AFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
+KBUILD_CFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
+KBUILD_CFLAGS += -fno-optimize-sibling-calls -fno-jump-tables -falign-functions=4
+endif
+
+ifdef CONFIG_UNWINDER_ORC
+orc_hash_h := arch/$(SRCARCH)/include/generated/asm/orc_hash.h
+orc_hash_sh := $(srctree)/scripts/orc_hash.sh
+targets += $(orc_hash_h)
+quiet_cmd_orc_hash = GEN $@
+ cmd_orc_hash = mkdir -p $(dir $@); \
+ $(CONFIG_SHELL) $(orc_hash_sh) < $< > $@
+$(orc_hash_h): $(srctree)/arch/loongarch/include/asm/orc_types.h $(orc_hash_sh) FORCE
+ $(call if_changed,orc_hash)
+archprepare: $(orc_hash_h)
+endif
+
ifdef CONFIG_DYNAMIC_FTRACE
KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
CC_FLAGS_FTRACE := -fpatchable-function-entry=2
diff --git a/arch/loongarch/configs/loongson3_defconfig b/arch/loongarch/configs/loongson3_defconfig
index a3b52aa..de911c3 100644
--- a/arch/loongarch/configs/loongson3_defconfig
+++ b/arch/loongarch/configs/loongson3_defconfig
@@ -5,6 +5,7 @@ CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_BPF_SYSCALL=y
CONFIG_BPF_JIT=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_PREEMPT=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
index 93783fa..2bb285c 100644
--- a/arch/loongarch/include/asm/Kbuild
+++ b/arch/loongarch/include/asm/Kbuild
@@ -1,4 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
+generated-y += orc_hash.h
generic-y += dma-contiguous.h
generic-y += mcs_spinlock.h
generic-y += parport.h
diff --git a/arch/loongarch/include/asm/bug.h b/arch/loongarch/include/asm/bug.h
index d4ca3ba..0838887 100644
--- a/arch/loongarch/include/asm/bug.h
+++ b/arch/loongarch/include/asm/bug.h
@@ -44,6 +44,7 @@
do { \
instrumentation_begin(); \
__BUG_FLAGS(BUGFLAG_WARNING|(flags)); \
+ annotate_reachable(); \
instrumentation_end(); \
} while (0)

diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h
index 81b0c4c..ae4e100 100644
--- a/arch/loongarch/include/asm/linkage.h
+++ b/arch/loongarch/include/asm/linkage.h
@@ -2,6 +2,8 @@
#ifndef __ASM_LINKAGE_H
#define __ASM_LINKAGE_H

+#include <asm/unwind_hints.h>
+
#define __ALIGN .align 2
#define __ALIGN_STR __stringify(__ALIGN)

diff --git a/arch/loongarch/include/asm/module.h b/arch/loongarch/include/asm/module.h
index 2ecd82b..96af0ba 100644
--- a/arch/loongarch/include/asm/module.h
+++ b/arch/loongarch/include/asm/module.h
@@ -6,6 +6,7 @@
#define _ASM_MODULE_H

#include <asm/inst.h>
+#include <asm/orc_types.h>
#include <asm-generic/module.h>

#define RELA_STACK_DEPTH 16
@@ -23,6 +24,12 @@ struct mod_arch_specific {

/* For CONFIG_DYNAMIC_FTRACE */
struct plt_entry *ftrace_trampolines;
+
+#ifdef CONFIG_UNWINDER_ORC
+ unsigned int num_orcs;
+ int *orc_unwind_ip;
+ struct orc_entry *orc_unwind;
+#endif
};

struct got_entry {
diff --git a/arch/loongarch/include/asm/orc_header.h b/arch/loongarch/include/asm/orc_header.h
new file mode 100644
index 0000000..07bacf3
--- /dev/null
+++ b/arch/loongarch/include/asm/orc_header.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#ifndef _ORC_HEADER_H
+#define _ORC_HEADER_H
+
+#include <linux/types.h>
+#include <linux/compiler.h>
+#include <asm/orc_hash.h>
+
+/*
+ * The header is currently a 20-byte hash of the ORC entry definition; see
+ * scripts/orc_hash.sh.
+ */
+#define ORC_HEADER \
+ __used __section(".orc_header") __aligned(4) \
+ static const u8 orc_header[] = { ORC_HASH }
+
+#endif /* _ORC_HEADER_H */
diff --git a/arch/loongarch/include/asm/orc_lookup.h b/arch/loongarch/include/asm/orc_lookup.h
new file mode 100644
index 0000000..2416312
--- /dev/null
+++ b/arch/loongarch/include/asm/orc_lookup.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
+ */
+#ifndef _ORC_LOOKUP_H
+#define _ORC_LOOKUP_H
+
+/*
+ * This is a lookup table for speeding up access to the .orc_unwind table.
+ * Given an input address offset, the corresponding lookup table entry
+ * specifies a subset of the .orc_unwind table to search.
+ *
+ * Each block represents the end of the previous range and the start of the
+ * next range. An extra block is added to give the last range an end.
+ *
+ * The block size should be a power of 2 to avoid a costly 'div' instruction.
+ *
+ * A block size of 256 was chosen because it roughly doubles unwinder
+ * performance while only adding ~5% to the ORC data footprint.
+ */
+#define LOOKUP_BLOCK_ORDER 8
+#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER)
+
+#ifndef LINKER_SCRIPT
+
+extern unsigned int orc_lookup[];
+extern unsigned int orc_lookup_end[];
+
+#define LOOKUP_START_IP (unsigned long)_stext
+#define LOOKUP_STOP_IP (unsigned long)_etext
+
+#endif /* LINKER_SCRIPT */
+
+#endif /* _ORC_LOOKUP_H */
diff --git a/arch/loongarch/include/asm/orc_types.h b/arch/loongarch/include/asm/orc_types.h
new file mode 100644
index 0000000..1d37e62
--- /dev/null
+++ b/arch/loongarch/include/asm/orc_types.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ORC_TYPES_H
+#define _ORC_TYPES_H
+
+#include <linux/types.h>
+
+/*
+ * The ORC_REG_* registers are base registers which are used to find other
+ * registers on the stack.
+ *
+ * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the
+ * address of the previous frame: the caller's SP before it called the current
+ * function.
+ *
+ * ORC_REG_UNDEFINED means the corresponding register's value didn't change in
+ * the current frame.
+ *
+ * The most commonly used base registers are SP and BP -- which the previous SP
+ * is usually based on -- and PREV_SP and UNDEFINED -- which the previous BP is
+ * usually based on.
+ *
+ * The rest of the base registers are needed for special cases like entry code
+ * and GCC realigned stacks.
+ */
+#define ORC_REG_UNDEFINED 0
+#define ORC_REG_PREV_SP 1
+#define ORC_REG_SP 2
+#define ORC_REG_BP 3
+#define ORC_REG_MAX 4
+
+#define ORC_TYPE_UNDEFINED 0
+#define ORC_TYPE_END_OF_STACK 1
+#define ORC_TYPE_CALL 2
+#define ORC_TYPE_REGS 3
+#define ORC_TYPE_REGS_PARTIAL 4
+
+#ifndef __ASSEMBLY__
+/*
+ * This struct is more or less a vastly simplified version of the DWARF Call
+ * Frame Information standard. It contains only the necessary parts of DWARF
+ * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
+ * unwinder how to find the previous SP and BP (and sometimes entry regs) on
+ * the stack for a given code address. Each instance of the struct corresponds
+ * to one or more code locations.
+ */
+struct orc_entry {
+ s16 sp_offset;
+ s16 bp_offset;
+ s16 ra_offset;
+ unsigned int sp_reg:4;
+ unsigned int bp_reg:4;
+ unsigned int ra_reg:4;
+ unsigned int type:3;
+ unsigned int signal:1;
+};
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ORC_TYPES_H */
diff --git a/arch/loongarch/include/asm/stackframe.h b/arch/loongarch/include/asm/stackframe.h
index 4fb1e64..45b507a 100644
--- a/arch/loongarch/include/asm/stackframe.h
+++ b/arch/loongarch/include/asm/stackframe.h
@@ -13,6 +13,7 @@
#include <asm/asm-offsets.h>
#include <asm/loongarch.h>
#include <asm/thread_info.h>
+#include <asm/unwind_hints.h>

/* Make the addition of cfi info a little easier. */
.macro cfi_rel_offset reg offset=0 docfi=0
@@ -162,6 +163,7 @@
li.w t0, CSR_CRMD_WE
csrxchg t0, t0, LOONGARCH_CSR_CRMD
#endif
+ UNWIND_HINT_REGS
.endm

.macro SAVE_ALL docfi=0
@@ -219,6 +221,7 @@

.macro RESTORE_SP_AND_RET docfi=0
cfi_ld sp, PT_R3, \docfi
+ UNWIND_HINT_FUNC
ertn
.endm

diff --git a/arch/loongarch/include/asm/unwind.h b/arch/loongarch/include/asm/unwind.h
index b9dce87..d36e04e 100644
--- a/arch/loongarch/include/asm/unwind.h
+++ b/arch/loongarch/include/asm/unwind.h
@@ -16,6 +16,7 @@
enum unwinder_type {
UNWINDER_GUESS,
UNWINDER_PROLOGUE,
+ UNWINDER_ORC,
};

struct unwind_state {
@@ -24,7 +25,7 @@ struct unwind_state {
struct task_struct *task;
bool first, error, reset;
int graph_idx;
- unsigned long sp, pc, ra;
+ unsigned long sp, pc, ra, fp;
};

bool default_next_frame(struct unwind_state *state);
@@ -34,6 +35,17 @@ void unwind_start(struct unwind_state *state,
bool unwind_next_frame(struct unwind_state *state);
unsigned long unwind_get_return_address(struct unwind_state *state);

+#ifdef CONFIG_UNWINDER_ORC
+void unwind_init(void);
+void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
+ void *orc, size_t orc_size);
+#else
+static inline void unwind_init(void) {}
+static inline
+void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
+ void *orc, size_t orc_size) {}
+#endif
+
static inline bool unwind_done(struct unwind_state *state)
{
return state->stack_info.type == STACK_TYPE_UNKNOWN;
@@ -61,14 +73,17 @@ static __always_inline void __unwind_start(struct unwind_state *state,
state->sp = regs->regs[3];
state->pc = regs->csr_era;
state->ra = regs->regs[1];
+ state->fp = regs->regs[22];
} else if (task && task != current) {
state->sp = thread_saved_fp(task);
state->pc = thread_saved_ra(task);
state->ra = 0;
+ state->fp = 0;
} else {
state->sp = (unsigned long)__builtin_frame_address(0);
state->pc = (unsigned long)__builtin_return_address(0);
state->ra = 0;
+ state->fp = 0;
}
state->task = task;
get_stack_info(state->sp, state->task, &state->stack_info);
@@ -77,6 +92,9 @@ static __always_inline void __unwind_start(struct unwind_state *state,

static __always_inline unsigned long __unwind_get_return_address(struct unwind_state *state)
{
- return unwind_done(state) ? 0 : state->pc;
+ if (unwind_done(state))
+ return 0;
+
+ return __kernel_text_address(state->pc) ? state->pc : 0;
}
#endif /* _ASM_UNWIND_H */
diff --git a/arch/loongarch/include/asm/unwind_hints.h b/arch/loongarch/include/asm/unwind_hints.h
new file mode 100644
index 0000000..82443fe
--- /dev/null
+++ b/arch/loongarch/include/asm/unwind_hints.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LOONGARCH_UNWIND_HINTS_H
+#define _ASM_LOONGARCH_UNWIND_HINTS_H
+
+#include <linux/objtool.h>
+#include <asm/orc_types.h>
+
+#ifdef __ASSEMBLY__
+
+.macro UNWIND_HINT_UNDEFINED
+ UNWIND_HINT type=UNWIND_HINT_TYPE_UNDEFINED
+.endm
+
+.macro UNWIND_HINT_EMPTY
+ UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL
+.endm
+
+.macro UNWIND_HINT_REGS
+ UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_REGS
+.endm
+
+.macro UNWIND_HINT_FUNC
+ UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_CALL
+.endm
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_LOONGARCH_UNWIND_HINTS_H */
diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
index 4fcc168..a89428c 100644
--- a/arch/loongarch/kernel/Makefile
+++ b/arch/loongarch/kernel/Makefile
@@ -3,6 +3,8 @@
# Makefile for the Linux/LoongArch kernel.
#

+OBJECT_FILES_NON_STANDARD_head.o := y
+
extra-y := vmlinux.lds

obj-y += head.o cpu-probe.o cacheinfo.o env.o setup.o entry.o genex.o \
@@ -62,6 +64,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o

obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
obj-$(CONFIG_UNWINDER_PROLOGUE) += unwind_prologue.o
+obj-$(CONFIG_UNWINDER_ORC) += unwind_orc.o

obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_regs.o
obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
index 65518bb..e43115f 100644
--- a/arch/loongarch/kernel/entry.S
+++ b/arch/loongarch/kernel/entry.S
@@ -14,11 +14,13 @@
#include <asm/regdef.h>
#include <asm/stackframe.h>
#include <asm/thread_info.h>
+#include <asm/unwind_hints.h>

.text
.cfi_sections .debug_frame
.align 5
-SYM_FUNC_START(handle_syscall)
+SYM_CODE_START(handle_syscall)
+ UNWIND_HINT_UNDEFINED
csrrd t0, PERCPU_BASE_KS
la.pcrel t1, kernelsp
add.d t1, t1, t0
@@ -56,6 +58,7 @@ SYM_FUNC_START(handle_syscall)
cfi_st u0, PT_R21
cfi_st fp, PT_R22

+ UNWIND_HINT_REGS
SAVE_STATIC

#ifdef CONFIG_KGDB
@@ -71,10 +74,11 @@ SYM_FUNC_START(handle_syscall)
bl do_syscall

RESTORE_ALL_AND_RET
-SYM_FUNC_END(handle_syscall)
+SYM_CODE_END(handle_syscall)
_ASM_NOKPROBE(handle_syscall)

SYM_CODE_START(ret_from_fork)
+ UNWIND_HINT_REGS
bl schedule_tail # a0 = struct task_struct *prev
move a0, sp
bl syscall_exit_to_user_mode
@@ -84,6 +88,7 @@ SYM_CODE_START(ret_from_fork)
SYM_CODE_END(ret_from_fork)

SYM_CODE_START(ret_from_kernel_thread)
+ UNWIND_HINT_REGS
bl schedule_tail # a0 = struct task_struct *prev
move a0, s1
jirl ra, s0, 0
diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
index 78f0663..3f18e3b 100644
--- a/arch/loongarch/kernel/genex.S
+++ b/arch/loongarch/kernel/genex.S
@@ -31,7 +31,8 @@ SYM_FUNC_START(__arch_cpu_idle)
1: jr ra
SYM_FUNC_END(__arch_cpu_idle)

-SYM_FUNC_START(handle_vint)
+SYM_CODE_START(handle_vint)
+ UNWIND_HINT_UNDEFINED
BACKUP_T0T1
SAVE_ALL
la_abs t1, __arch_cpu_idle
@@ -46,11 +47,12 @@ SYM_FUNC_START(handle_vint)
la_abs t0, do_vint
jirl ra, t0, 0
RESTORE_ALL_AND_RET
-SYM_FUNC_END(handle_vint)
+SYM_CODE_END(handle_vint)

-SYM_FUNC_START(except_vec_cex)
+SYM_CODE_START(except_vec_cex)
+ UNWIND_HINT_UNDEFINED
b cache_parity_error
-SYM_FUNC_END(except_vec_cex)
+SYM_CODE_END(except_vec_cex)

.macro build_prep_badv
csrrd t0, LOONGARCH_CSR_BADV
@@ -66,7 +68,8 @@ SYM_FUNC_END(except_vec_cex)

.macro BUILD_HANDLER exception handler prep
.align 5
- SYM_FUNC_START(handle_\exception)
+ SYM_CODE_START(handle_\exception)
+ UNWIND_HINT_UNDEFINED
666:
BACKUP_T0T1
SAVE_ALL
@@ -76,7 +79,7 @@ SYM_FUNC_END(except_vec_cex)
jirl ra, t0, 0
668:
RESTORE_ALL_AND_RET
- SYM_FUNC_END(handle_\exception)
+ SYM_CODE_END(handle_\exception)
SYM_DATA(unwind_hint_\exception, .word 668b - 666b)
.endm

@@ -93,7 +96,8 @@ SYM_FUNC_END(except_vec_cex)
BUILD_HANDLER watch watch none
BUILD_HANDLER reserved reserved none /* others */

-SYM_FUNC_START(handle_sys)
+SYM_CODE_START(handle_sys)
+ UNWIND_HINT_UNDEFINED
la_abs t0, handle_syscall
jr t0
-SYM_FUNC_END(handle_sys)
+SYM_CODE_END(handle_sys)
diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
index 53b883d..5664390 100644
--- a/arch/loongarch/kernel/head.S
+++ b/arch/loongarch/kernel/head.S
@@ -43,6 +43,7 @@ SYM_DATA(kernel_offset, .long _kernel_offset);
.align 12

SYM_CODE_START(kernel_entry) # kernel entry point
+ UNWIND_HINT_EMPTY

/* Config direct window and set PG */
li.d t0, CSR_DMW0_INIT # UC, PLV0, 0x8000 xxxx xxxx xxxx
diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
index b13b285..83db7e5 100644
--- a/arch/loongarch/kernel/module.c
+++ b/arch/loongarch/kernel/module.c
@@ -20,6 +20,7 @@
#include <linux/kernel.h>
#include <asm/alternative.h>
#include <asm/inst.h>
+#include <asm/unwind.h>

static int rela_stack_push(s64 stack_value, s64 *rela_stack, size_t *rela_stack_top)
{
@@ -515,7 +516,7 @@ static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs, struct module *mod)
{
- const Elf_Shdr *s, *se;
+ const Elf_Shdr *s, *se, *orc = NULL, *orc_ip = NULL;
const char *secstrs = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;

for (s = sechdrs, se = sechdrs + hdr->e_shnum; s < se; s++) {
@@ -523,7 +524,15 @@ int module_finalize(const Elf_Ehdr *hdr,
apply_alternatives((void *)s->sh_addr, (void *)s->sh_addr + s->sh_size);
if (!strcmp(".ftrace_trampoline", secstrs + s->sh_name))
module_init_ftrace_plt(hdr, s, mod);
+ if (!strcmp(".orc_unwind", secstrs + s->sh_name))
+ orc = s;
+ if (!strcmp(".orc_unwind_ip", secstrs + s->sh_name))
+ orc_ip = s;
}

+ if (orc && orc_ip)
+ unwind_module_init(mod, (void *)orc_ip->sh_addr, orc_ip->sh_size,
+ (void *)orc->sh_addr, orc->sh_size);
+
return 0;
}
diff --git a/arch/loongarch/kernel/relocate_kernel.S b/arch/loongarch/kernel/relocate_kernel.S
index f49f6b0..bcc191d 100644
--- a/arch/loongarch/kernel/relocate_kernel.S
+++ b/arch/loongarch/kernel/relocate_kernel.S
@@ -15,6 +15,7 @@
#include <asm/addrspace.h>

SYM_CODE_START(relocate_new_kernel)
+ UNWIND_HINT_UNDEFINED
/*
* a0: EFI boot flag for the new kernel
* a1: Command line pointer for the new kernel
@@ -90,6 +91,7 @@ SYM_CODE_END(relocate_new_kernel)
* then start at the entry point from LOONGARCH_IOCSR_MBUF0.
*/
SYM_CODE_START(kexec_smp_wait)
+ UNWIND_HINT_UNDEFINED
1: li.w t0, 0x100 /* wait for init loop */
2: addi.w t0, t0, -1 /* limit mailbox access */
bnez t0, 2b
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index 7783f0a..a173b02 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -48,6 +48,7 @@
#include <asm/sections.h>
#include <asm/setup.h>
#include <asm/time.h>
+#include <asm/unwind.h>

#define SMBIOS_BIOSSIZE_OFFSET 0x09
#define SMBIOS_BIOSEXTERN_OFFSET 0x13
@@ -605,6 +606,7 @@ static void __init prefill_possible_map(void)

void __init setup_arch(char **cmdline_p)
{
+ unwind_init();
cpu_probe();

init_environ();
diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
index 92270f1..9848d42 100644
--- a/arch/loongarch/kernel/stacktrace.c
+++ b/arch/loongarch/kernel/stacktrace.c
@@ -29,6 +29,7 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
regs->csr_era = thread_saved_ra(task);
}
regs->regs[1] = 0;
+ regs->regs[22] = 0;
}

for (unwind_start(&state, task, regs);
diff --git a/arch/loongarch/kernel/unwind_orc.c b/arch/loongarch/kernel/unwind_orc.c
new file mode 100644
index 0000000..08f80ca0
--- /dev/null
+++ b/arch/loongarch/kernel/unwind_orc.c
@@ -0,0 +1,571 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/objtool.h>
+#include <linux/module.h>
+#include <linux/sort.h>
+#include <asm/exception.h>
+#include <asm/orc_types.h>
+#include <asm/orc_lookup.h>
+#include <asm/orc_header.h>
+#include <asm/ptrace.h>
+#include <asm/setup.h>
+#include <asm/stacktrace.h>
+#include <asm/tlb.h>
+#include <asm/unwind.h>
+
+ORC_HEADER;
+
+#define orc_warn(fmt, ...) \
+ printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
+
+extern int __start_orc_unwind_ip[];
+extern int __stop_orc_unwind_ip[];
+extern struct orc_entry __start_orc_unwind[];
+extern struct orc_entry __stop_orc_unwind[];
+
+static bool orc_init __ro_after_init;
+static unsigned int lookup_num_blocks __ro_after_init;
+
+/* Fake frame pointer entry -- used as a fallback for generated code */
+static struct orc_entry orc_fp_entry = {
+ .type = UNWIND_HINT_TYPE_CALL,
+ .sp_reg = ORC_REG_BP,
+ .sp_offset = 16,
+ .bp_reg = ORC_REG_PREV_SP,
+ .bp_offset = -16,
+ .ra_reg = ORC_REG_PREV_SP,
+ .ra_offset = -8,
+};
+
+static inline unsigned long orc_ip(const int *ip)
+{
+ return (unsigned long)ip + *ip;
+}
+
+static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table,
+ unsigned int num_entries, unsigned long ip)
+{
+ int *first = ip_table;
+ int *last = ip_table + num_entries - 1;
+ int *mid = first, *found = first;
+
+ if (!num_entries)
+ return NULL;
+
+ /*
+ * Do a binary range search to find the rightmost duplicate of a given
+ * starting address. Some entries are section terminators which are
+ * "weak" entries for ensuring there are no gaps. They should be
+ * ignored when they conflict with a real entry.
+ */
+ while (first <= last) {
+ mid = first + ((last - first) / 2);
+
+ if (orc_ip(mid) <= ip) {
+ found = mid;
+ first = mid + 1;
+ } else
+ last = mid - 1;
+ }
+
+ return u_table + (found - ip_table);
+}
+
+#ifdef CONFIG_MODULES
+static struct orc_entry *orc_module_find(unsigned long ip)
+{
+ struct module *mod;
+
+ mod = __module_address(ip);
+ if (!mod || !mod->arch.orc_unwind || !mod->arch.orc_unwind_ip)
+ return NULL;
+ return __orc_find(mod->arch.orc_unwind_ip, mod->arch.orc_unwind,
+ mod->arch.num_orcs, ip);
+}
+#else
+static struct orc_entry *orc_module_find(unsigned long ip)
+{
+ return NULL;
+}
+#endif
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+static struct orc_entry *orc_find(unsigned long ip);
+
+/*
+ * Ftrace dynamic trampolines do not have orc entries of their own.
+ * But they are copies of the ftrace entries that are static and
+ * defined in ftrace_*.S, which do have orc entries.
+ *
+ * If the unwinder comes across a ftrace trampoline, then find the
+ * ftrace function that was used to create it, and use that ftrace
+ * function's orc entry, as the placement of the return code in
+ * the stack will be identical.
+ */
+static struct orc_entry *orc_ftrace_find(unsigned long ip)
+{
+ struct ftrace_ops *ops;
+ unsigned long tramp_addr, offset;
+
+ ops = ftrace_ops_trampoline(ip);
+ if (!ops)
+ return NULL;
+
+ /* Set tramp_addr to the start of the code copied by the trampoline */
+ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
+ tramp_addr = (unsigned long)ftrace_regs_caller;
+ else
+ tramp_addr = (unsigned long)ftrace_caller;
+
+ /* Now place tramp_addr to the location within the trampoline ip is at */
+ offset = ip - ops->trampoline;
+ tramp_addr += offset;
+
+ /* Prevent unlikely recursion */
+ if (ip == tramp_addr)
+ return NULL;
+
+ return orc_find(tramp_addr);
+}
+#else
+static struct orc_entry *orc_ftrace_find(unsigned long ip)
+{
+ return NULL;
+}
+#endif
+
+/*
+ * If we crash with IP==0, the last successfully executed instruction
+ * was probably an indirect function call with a NULL function pointer,
+ * and we don't have unwind information for NULL.
+ * This hardcoded ORC entry for IP==0 allows us to unwind from a NULL function
+ * pointer into its parent and then continue normally from there.
+ */
+static struct orc_entry null_orc_entry = {
+ .sp_offset = sizeof(long),
+ .sp_reg = ORC_REG_SP,
+ .bp_reg = ORC_REG_UNDEFINED,
+ .type = ORC_TYPE_CALL
+};
+
+static struct orc_entry *orc_find(unsigned long ip)
+{
+ static struct orc_entry *orc;
+
+ if (ip == 0)
+ return &null_orc_entry;
+
+ /* For non-init vmlinux addresses, use the fast lookup table: */
+ if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
+ unsigned int idx, start, stop;
+
+ idx = (ip - LOOKUP_START_IP) / LOOKUP_BLOCK_SIZE;
+
+ if (unlikely((idx >= lookup_num_blocks-1))) {
+ orc_warn("WARNING: bad lookup idx: idx=%u num=%u ip=%pB\n",
+ idx, lookup_num_blocks, (void *)ip);
+ return NULL;
+ }
+
+ start = orc_lookup[idx];
+ stop = orc_lookup[idx + 1] + 1;
+
+ if (unlikely((__start_orc_unwind + start >= __stop_orc_unwind) ||
+ (__start_orc_unwind + stop > __stop_orc_unwind))) {
+ orc_warn("WARNING: bad lookup value: idx=%u num=%u start=%u stop=%u ip=%pB\n",
+ idx, lookup_num_blocks, start, stop, (void *)ip);
+ return NULL;
+ }
+
+ return __orc_find(__start_orc_unwind_ip + start,
+ __start_orc_unwind + start, stop - start, ip);
+ }
+
+ /* vmlinux .init slow lookup: */
+ if (is_kernel_inittext(ip))
+ return __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
+ __stop_orc_unwind_ip - __start_orc_unwind_ip, ip);
+
+ /* Module lookup: */
+ orc = orc_module_find(ip);
+ if (orc)
+ return orc;
+
+ return orc_ftrace_find(ip);
+}
+
+#ifdef CONFIG_MODULES
+
+static DEFINE_MUTEX(sort_mutex);
+static int *cur_orc_ip_table = __start_orc_unwind_ip;
+static struct orc_entry *cur_orc_table = __start_orc_unwind;
+
+static void orc_sort_swap(void *_a, void *_b, int size)
+{
+ struct orc_entry *orc_a, *orc_b;
+ int *a = _a, *b = _b, tmp;
+ int delta = _b - _a;
+
+ /* Swap the .orc_unwind_ip entries: */
+ tmp = *a;
+ *a = *b + delta;
+ *b = tmp - delta;
+
+ /* Swap the corresponding .orc_unwind entries: */
+ orc_a = cur_orc_table + (a - cur_orc_ip_table);
+ orc_b = cur_orc_table + (b - cur_orc_ip_table);
+ swap(*orc_a, *orc_b);
+}
+
+static int orc_sort_cmp(const void *_a, const void *_b)
+{
+ struct orc_entry *orc_a;
+ const int *a = _a, *b = _b;
+ unsigned long a_val = orc_ip(a);
+ unsigned long b_val = orc_ip(b);
+
+ if (a_val > b_val)
+ return 1;
+ if (a_val < b_val)
+ return -1;
+
+ /*
+ * The "weak" section terminator entries need to always be first
+ * to ensure the lookup code skips them in favor of real entries.
+ * These terminator entries exist to handle any gaps created by
+ * whitelisted .o files which didn't get objtool generation.
+ */
+ orc_a = cur_orc_table + (a - cur_orc_ip_table);
+ return orc_a->type == ORC_TYPE_UNDEFINED ? -1 : 1;
+}
+
+void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size,
+ void *_orc, size_t orc_size)
+{
+ int *orc_ip = _orc_ip;
+ struct orc_entry *orc = _orc;
+ unsigned int num_entries = orc_ip_size / sizeof(int);
+
+ WARN_ON_ONCE(orc_ip_size % sizeof(int) != 0 ||
+ orc_size % sizeof(*orc) != 0 ||
+ num_entries != orc_size / sizeof(*orc));
+
+ /*
+ * The 'cur_orc_*' globals allow the orc_sort_swap() callback to
+ * associate an .orc_unwind_ip table entry with its corresponding
+ * .orc_unwind entry so they can both be swapped.
+ */
+ mutex_lock(&sort_mutex);
+ cur_orc_ip_table = orc_ip;
+ cur_orc_table = orc;
+ sort(orc_ip, num_entries, sizeof(int), orc_sort_cmp, orc_sort_swap);
+ mutex_unlock(&sort_mutex);
+
+ mod->arch.orc_unwind_ip = orc_ip;
+ mod->arch.orc_unwind = orc;
+ mod->arch.num_orcs = num_entries;
+}
+#endif
+
+void __init unwind_init(void)
+{
+ size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip;
+ size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind;
+ size_t num_entries = orc_ip_size / sizeof(int);
+ struct orc_entry *orc;
+ int i;
+
+ if (!num_entries || orc_ip_size % sizeof(int) != 0 ||
+ orc_size % sizeof(struct orc_entry) != 0 ||
+ num_entries != orc_size / sizeof(struct orc_entry)) {
+ orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n");
+ return;
+ }
+
+ /*
+ * Note, the orc_unwind and orc_unwind_ip tables were already
+ * sorted at build time via the 'sorttable' tool.
+ * It's ready for binary search straight away, no need to sort it.
+ */
+
+ /* Initialize the fast lookup table: */
+ lookup_num_blocks = orc_lookup_end - orc_lookup;
+ for (i = 0; i < lookup_num_blocks-1; i++) {
+ orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
+ num_entries,
+ LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
+ if (!orc) {
+ orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
+ return;
+ }
+
+ orc_lookup[i] = orc - __start_orc_unwind;
+ }
+
+ /* Initialize the ending block: */
+ orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries,
+ LOOKUP_STOP_IP);
+ if (!orc) {
+ orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
+ return;
+ }
+ orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind;
+
+ orc_init = true;
+}
+
+static inline bool on_stack(struct stack_info *info, unsigned long addr, size_t len)
+{
+ unsigned long begin = info->begin;
+ unsigned long end = info->end;
+
+ return (info->type != STACK_TYPE_UNKNOWN &&
+ addr >= begin && addr < end &&
+ addr + len > begin && addr + len <= end);
+}
+
+static bool stack_access_ok(struct unwind_state *state, unsigned long addr,
+ size_t len)
+{
+ struct stack_info *info = &state->stack_info;
+
+ if (on_stack(info, addr, len))
+ return true;
+
+ return !get_stack_info(addr, state->task, info) &&
+ on_stack(info, addr, len);
+}
+
+unsigned long unwind_get_return_address(struct unwind_state *state)
+{
+ return __unwind_get_return_address(state);
+}
+EXPORT_SYMBOL_GPL(unwind_get_return_address);
+
+void unwind_start(struct unwind_state *state, struct task_struct *task,
+ struct pt_regs *regs)
+{
+ __unwind_start(state, task, regs);
+ if (!unwind_done(state) && !__kernel_text_address(state->pc))
+ unwind_next_frame(state);
+}
+EXPORT_SYMBOL_GPL(unwind_start);
+
+static bool is_entry_func(unsigned long addr)
+{
+ extern u32 kernel_entry;
+ extern u32 kernel_entry_end;
+
+ return addr >= (unsigned long)&kernel_entry &&
+ addr < (unsigned long)&kernel_entry_end;
+}
+
+static inline unsigned long bt_address(unsigned long ra)
+{
+ extern unsigned long eentry;
+
+ if (__kernel_text_address(ra))
+ return ra;
+
+ /* We are in preempt_disable() here */
+ if (__module_text_address(ra))
+ return ra;
+
+ if (ra >= eentry && ra < eentry + EXCCODE_INT_END * VECSIZE) {
+ unsigned long type = (ra - eentry) / VECSIZE;
+ unsigned long offset = (ra - eentry) % VECSIZE;
+ unsigned long func;
+
+ switch (type) {
+ case EXCCODE_TLBL:
+ case EXCCODE_TLBI:
+ func = (unsigned long)handle_tlb_load;
+ break;
+ case EXCCODE_TLBS:
+ func = (unsigned long)handle_tlb_store;
+ break;
+ case EXCCODE_TLBM:
+ func = (unsigned long)handle_tlb_modify;
+ break;
+ case EXCCODE_TLBNR:
+ case EXCCODE_TLBNX:
+ case EXCCODE_TLBPE:
+ func = (unsigned long)handle_tlb_protect;
+ break;
+ case EXCCODE_ADE:
+ func = (unsigned long)handle_ade;
+ break;
+ case EXCCODE_ALE:
+ func = (unsigned long)handle_ale;
+ break;
+ case EXCCODE_BCE:
+ func = (unsigned long)handle_bce;
+ break;
+ case EXCCODE_SYS:
+ func = (unsigned long)handle_sys;
+ break;
+ case EXCCODE_BP:
+ func = (unsigned long)handle_bp;
+ break;
+ case EXCCODE_INE:
+ case EXCCODE_IPE:
+ func = (unsigned long)handle_ri;
+ break;
+ case EXCCODE_FPDIS:
+ func = (unsigned long)handle_fpu;
+ break;
+ case EXCCODE_LSXDIS:
+ func = (unsigned long)handle_lsx;
+ break;
+ case EXCCODE_LASXDIS:
+ func = (unsigned long)handle_lasx;
+ break;
+ case EXCCODE_FPE:
+ func = (unsigned long)handle_fpe;
+ break;
+ case EXCCODE_WATCH:
+ func = (unsigned long)handle_watch;
+ break;
+ case EXCCODE_BTDIS:
+ func = (unsigned long)handle_lbt;
+ break;
+ case EXCCODE_INT_START ... EXCCODE_INT_END - 1:
+ func = (unsigned long)handle_vint;
+ break;
+ default:
+ func = (unsigned long)handle_reserved;
+ break;
+ }
+
+ return func + offset;
+ }
+
+ return ra;
+}
+
+bool unwind_next_frame(struct unwind_state *state)
+{
+ struct stack_info *info = &state->stack_info;
+ struct orc_entry *orc;
+ struct pt_regs *regs;
+ unsigned long *p, pc;
+
+ if (unwind_done(state))
+ return false;
+
+ /* Don't let modules unload while we're reading their ORC data. */
+ preempt_disable();
+
+ if (is_entry_func(state->pc))
+ goto end;
+
+ orc = orc_find(state->pc);
+ if (!orc) {
+ orc = &orc_fp_entry;
+ state->error = true;
+ }
+
+ switch (orc->sp_reg) {
+ case ORC_REG_SP:
+ state->sp = state->sp + orc->sp_offset;
+ break;
+ case ORC_REG_BP:
+ state->sp = state->fp;
+ break;
+ default:
+ orc_warn("unknown SP base reg %d at %pB\n",
+ orc->sp_reg, (void *)state->pc);
+ goto err;
+ }
+
+ switch (orc->bp_reg) {
+ case ORC_REG_PREV_SP:
+ p = (unsigned long *)(state->sp + orc->bp_offset);
+ if (!stack_access_ok(state, (unsigned long)p, sizeof(unsigned long)))
+ goto err;
+
+ state->fp = *p;
+ break;
+ case ORC_REG_UNDEFINED:
+ /* Nothing. */
+ break;
+ default:
+ orc_warn("unknown FP base reg %d at %pB\n",
+ orc->bp_reg, (void *)state->pc);
+ goto err;
+ }
+
+ switch (orc->type) {
+ case UNWIND_HINT_TYPE_CALL:
+ if (orc->ra_reg == ORC_REG_PREV_SP) {
+ p = (unsigned long *)(state->sp + orc->ra_offset);
+ if (!stack_access_ok(state, (unsigned long)p, sizeof(unsigned long)))
+ goto err;
+
+ pc = unwind_graph_addr(state, *p, state->sp);
+ pc -= LOONGARCH_INSN_SIZE;
+ } else if (orc->ra_reg == ORC_REG_UNDEFINED) {
+ if (!state->ra || state->ra == state->pc)
+ goto err;
+
+ pc = unwind_graph_addr(state, state->ra, state->sp);
+ pc -= LOONGARCH_INSN_SIZE;
+ state->ra = 0;
+ } else {
+ orc_warn("unknown ra base reg %d at %pB\n",
+ orc->ra_reg, (void *)state->pc);
+ goto err;
+ }
+ break;
+ case UNWIND_HINT_TYPE_REGS:
+ if (state->stack_info.type == STACK_TYPE_IRQ && state->sp == info->end)
+ regs = (struct pt_regs *)info->next_sp;
+ else
+ regs = (struct pt_regs *)state->sp;
+
+ if (!stack_access_ok(state, (unsigned long)regs, sizeof(*regs)))
+ goto err;
+
+ if ((info->end == (unsigned long)regs + sizeof(*regs)) &&
+ !regs->regs[3] && !regs->regs[1])
+ goto end;
+
+ if (user_mode(regs))
+ goto end;
+
+ pc = regs->csr_era;
+ if (!__kernel_text_address(pc))
+ goto err;
+
+ state->sp = regs->regs[3];
+ state->ra = regs->regs[1];
+ state->fp = regs->regs[22];
+ get_stack_info(state->sp, state->task, info);
+
+ break;
+ default:
+ orc_warn("unknown .orc_unwind entry type %d at %pB\n",
+ orc->type, (void *)state->pc);
+ goto err;
+ }
+
+ state->pc = bt_address(pc);
+ if (!state->pc) {
+ pr_err("cannot find unwind pc at %pK\n", (void *)pc);
+ goto err;
+ }
+
+ if (!__kernel_text_address(state->pc))
+ goto err;
+
+ preempt_enable();
+ return true;
+
+err:
+ state->error = true;
+
+end:
+ preempt_enable();
+ state->stack_info.type = STACK_TYPE_UNKNOWN;
+ return false;
+}
+EXPORT_SYMBOL_GPL(unwind_next_frame);
diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
index bb2ec86..09fd4eb 100644
--- a/arch/loongarch/kernel/vmlinux.lds.S
+++ b/arch/loongarch/kernel/vmlinux.lds.S
@@ -2,6 +2,7 @@
#include <linux/sizes.h>
#include <asm/asm-offsets.h>
#include <asm/thread_info.h>
+#include <asm/orc_lookup.h>

#define PAGE_SIZE _PAGE_SIZE
#define RO_EXCEPTION_TABLE_ALIGN 4
@@ -99,6 +100,8 @@ SECTIONS
_sdata = .;
RO_DATA(4096)

+ ORC_UNWIND_TABLE
+
.got : ALIGN(16) { *(.got) }
.plt : ALIGN(16) { *(.plt) }
.got.plt : ALIGN(16) { *(.got.plt) }
diff --git a/arch/loongarch/lib/Makefile b/arch/loongarch/lib/Makefile
index a77bf160..e3023d9 100644
--- a/arch/loongarch/lib/Makefile
+++ b/arch/loongarch/lib/Makefile
@@ -3,6 +3,8 @@
# Makefile for LoongArch-specific library files.
#

+OBJECT_FILES_NON_STANDARD := y
+
lib-y += delay.o memset.o memcpy.o memmove.o \
clear_user.o copy_user.o csum.o dump_tlb.o unaligned.o

diff --git a/arch/loongarch/mm/tlbex.S b/arch/loongarch/mm/tlbex.S
index ca17dd3..a44387b 100644
--- a/arch/loongarch/mm/tlbex.S
+++ b/arch/loongarch/mm/tlbex.S
@@ -17,7 +17,8 @@
#define PTRS_PER_PTE_BITS (PAGE_SHIFT - 3)

.macro tlb_do_page_fault, write
- SYM_FUNC_START(tlb_do_page_fault_\write)
+ SYM_CODE_START(tlb_do_page_fault_\write)
+ UNWIND_HINT_UNDEFINED
SAVE_ALL
csrrd a2, LOONGARCH_CSR_BADV
move a0, sp
@@ -25,13 +26,14 @@
li.w a1, \write
bl do_page_fault
RESTORE_ALL_AND_RET
- SYM_FUNC_END(tlb_do_page_fault_\write)
+ SYM_CODE_END(tlb_do_page_fault_\write)
.endm

tlb_do_page_fault 0
tlb_do_page_fault 1

-SYM_FUNC_START(handle_tlb_protect)
+SYM_CODE_START(handle_tlb_protect)
+ UNWIND_HINT_UNDEFINED
BACKUP_T0T1
SAVE_ALL
move a0, sp
@@ -41,9 +43,10 @@ SYM_FUNC_START(handle_tlb_protect)
la_abs t0, do_page_fault
jirl ra, t0, 0
RESTORE_ALL_AND_RET
-SYM_FUNC_END(handle_tlb_protect)
+SYM_CODE_END(handle_tlb_protect)

-SYM_FUNC_START(handle_tlb_load)
+SYM_CODE_START(handle_tlb_load)
+ UNWIND_HINT_UNDEFINED
csrwr t0, EXCEPTION_KS0
csrwr t1, EXCEPTION_KS1
csrwr ra, EXCEPTION_KS2
@@ -187,16 +190,18 @@ nopage_tlb_load:
csrrd ra, EXCEPTION_KS2
la_abs t0, tlb_do_page_fault_0
jr t0
-SYM_FUNC_END(handle_tlb_load)
+SYM_CODE_END(handle_tlb_load)

-SYM_FUNC_START(handle_tlb_load_ptw)
+SYM_CODE_START(handle_tlb_load_ptw)
+ UNWIND_HINT_UNDEFINED
csrwr t0, LOONGARCH_CSR_KS0
csrwr t1, LOONGARCH_CSR_KS1
la_abs t0, tlb_do_page_fault_0
jr t0
-SYM_FUNC_END(handle_tlb_load_ptw)
+SYM_CODE_END(handle_tlb_load_ptw)

-SYM_FUNC_START(handle_tlb_store)
+SYM_CODE_START(handle_tlb_store)
+ UNWIND_HINT_UNDEFINED
csrwr t0, EXCEPTION_KS0
csrwr t1, EXCEPTION_KS1
csrwr ra, EXCEPTION_KS2
@@ -343,16 +348,18 @@ nopage_tlb_store:
csrrd ra, EXCEPTION_KS2
la_abs t0, tlb_do_page_fault_1
jr t0
-SYM_FUNC_END(handle_tlb_store)
+SYM_CODE_END(handle_tlb_store)

-SYM_FUNC_START(handle_tlb_store_ptw)
+SYM_CODE_START(handle_tlb_store_ptw)
+ UNWIND_HINT_UNDEFINED
csrwr t0, LOONGARCH_CSR_KS0
csrwr t1, LOONGARCH_CSR_KS1
la_abs t0, tlb_do_page_fault_1
jr t0
-SYM_FUNC_END(handle_tlb_store_ptw)
+SYM_CODE_END(handle_tlb_store_ptw)

-SYM_FUNC_START(handle_tlb_modify)
+SYM_CODE_START(handle_tlb_modify)
+ UNWIND_HINT_UNDEFINED
csrwr t0, EXCEPTION_KS0
csrwr t1, EXCEPTION_KS1
csrwr ra, EXCEPTION_KS2
@@ -497,16 +504,18 @@ nopage_tlb_modify:
csrrd ra, EXCEPTION_KS2
la_abs t0, tlb_do_page_fault_1
jr t0
-SYM_FUNC_END(handle_tlb_modify)
+SYM_CODE_END(handle_tlb_modify)

-SYM_FUNC_START(handle_tlb_modify_ptw)
+SYM_CODE_START(handle_tlb_modify_ptw)
+ UNWIND_HINT_UNDEFINED
csrwr t0, LOONGARCH_CSR_KS0
csrwr t1, LOONGARCH_CSR_KS1
la_abs t0, tlb_do_page_fault_1
jr t0
-SYM_FUNC_END(handle_tlb_modify_ptw)
+SYM_CODE_END(handle_tlb_modify_ptw)

-SYM_FUNC_START(handle_tlb_refill)
+SYM_CODE_START(handle_tlb_refill)
+ UNWIND_HINT_UNDEFINED
csrwr t0, LOONGARCH_CSR_TLBRSAVE
csrrd t0, LOONGARCH_CSR_PGD
lddir t0, t0, 3
@@ -521,4 +530,4 @@ SYM_FUNC_START(handle_tlb_refill)
tlbfill
csrrd t0, LOONGARCH_CSR_TLBRSAVE
ertn
-SYM_FUNC_END(handle_tlb_refill)
+SYM_CODE_END(handle_tlb_refill)
diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
index 58151d0..bbd1d47 100644
--- a/arch/loongarch/power/Makefile
+++ b/arch/loongarch/power/Makefile
@@ -1,3 +1,5 @@
+OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
+
obj-y += platform.o

obj-$(CONFIG_SUSPEND) += suspend.o suspend_asm.o
diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile
index 5c97d1463..997f41c 100644
--- a/arch/loongarch/vdso/Makefile
+++ b/arch/loongarch/vdso/Makefile
@@ -3,6 +3,7 @@

KASAN_SANITIZE := n
KCOV_INSTRUMENT := n
+OBJECT_FILES_NON_STANDARD := y

# Include the generic Makefile to check the built vdso.
include $(srctree)/lib/vdso/Makefile
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index d7779a1..df29ddb 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -116,6 +116,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
*/
#define __stringify_label(n) #n

+#define __annotate_reachable(c) ({ \
+ asm volatile(__stringify_label(c) ":\n\t" \
+ ".pushsection .discard.reachable\n\t" \
+ ".long " __stringify_label(c) "b - .\n\t" \
+ ".popsection\n\t"); \
+})
+#define annotate_reachable() __annotate_reachable(__COUNTER__)
+
#define __annotate_unreachable(c) ({ \
asm volatile(__stringify_label(c) ":\n\t" \
".pushsection .discard.unreachable\n\t" \
@@ -128,6 +136,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
#define __annotate_jump_table __section(".rodata..c_jump_table")

#else /* !CONFIG_OBJTOOL */
+#define annotate_reachable()
#define annotate_unreachable()
#define __annotate_jump_table
#endif /* CONFIG_OBJTOOL */
diff --git a/scripts/Makefile b/scripts/Makefile
index 576cf64..baaed78 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -33,7 +33,10 @@ ifdef CONFIG_UNWINDER_ORC
ifeq ($(ARCH),x86_64)
ARCH := x86
endif
-HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/x86/include
+ifeq ($(ARCH),loongarch)
+ARCH := loongarch
+endif
+HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/$(ARCH)/include
HOSTCFLAGS_sorttable.o += -DUNWINDER_ORC_ENABLED
endif

--
2.1.0
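
Note on the lookup in unwind_orc.c above: the .orc_unwind_ip table stores each address as a signed 32-bit offset from the entry's own location, and __orc_find() returns the rightmost entry whose address is at or below the target ip. The following is a minimal userspace sketch of that idea, not the kernel code. It uses array indices instead of the patch's pointer arithmetic, and the names rel_ip(), rel_set(), find_rightmost(), demo and demo_check() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Table and fake "text" share one object so every offset fits in 32 bits. */
static struct {
	int32_t table[4];
	char text[64];
} demo;

/* Decode a self-relative entry: the stored value is (target - &entry). */
static uintptr_t rel_ip(const int32_t *entry)
{
	return (uintptr_t)entry + (uintptr_t)(intptr_t)*entry;
}

/* Encode target relative to the entry's own address (hypothetical helper). */
static void rel_set(int32_t *entry, uintptr_t target)
{
	*entry = (int32_t)(target - (uintptr_t)entry);
}

/*
 * Return the index of the rightmost entry whose decoded address is <= ip.
 * If every entry lies above ip, index 0 is returned, which mirrors the
 * 'found = first' default in the patch. Returns -1 for an empty table.
 */
static ptrdiff_t find_rightmost(const int32_t *table, size_t n, uintptr_t ip)
{
	size_t lo = 0, hi = n, found = 0;

	if (!n)
		return -1;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (rel_ip(&table[mid]) <= ip) {
			found = mid;	/* candidate, keep looking to the right */
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}

	return (ptrdiff_t)found;
}

/* Small self-check with a duplicated start address at text + 16. */
static void demo_check(void)
{
	rel_set(&demo.table[0], (uintptr_t)&demo.text[0]);
	rel_set(&demo.table[1], (uintptr_t)&demo.text[16]);
	rel_set(&demo.table[2], (uintptr_t)&demo.text[16]);
	rel_set(&demo.table[3], (uintptr_t)&demo.text[32]);

	assert(find_rightmost(demo.table, 4, (uintptr_t)&demo.text[5]) == 0);
	assert(find_rightmost(demo.table, 4, (uintptr_t)&demo.text[16]) == 2);
	assert(find_rightmost(demo.table, 4, (uintptr_t)&demo.text[20]) == 2);
	assert(find_rightmost(demo.table, 4, (uintptr_t)&demo.text[40]) == 3);
	assert(find_rightmost(demo.table, 0, (uintptr_t)&demo.text[0]) == -1);
}
```

Entries 1 and 2 decode to the same address and the search settles on entry 2. That is the reason orc_sort_cmp() in the patch orders the "weak" terminator entries first: the real entry then ends up rightmost and wins the lookup.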

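The delta adjustment in orc_sort_swap() follows from the same self-relative encoding: moving a value to a different slot changes the base it is decoded against, so the value has to be corrected by the byte distance between the two slots. A small userspace sketch of just that step, assuming 32-bit offsets as in the patch; rel_decode(), rel_swap(), swap_demo and swap_check() are hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Two slots and their targets in one object, so offsets stay small. */
static struct {
	int32_t slot[2];
	char text[32];
} swap_demo;

/* Decode a self-relative slot: the stored value is (target - &slot). */
static uintptr_t rel_decode(const int32_t *slot)
{
	return (uintptr_t)slot + (uintptr_t)(intptr_t)*slot;
}

/* Swap two slots so that each one decodes to the other's old target. */
static void rel_swap(int32_t *a, int32_t *b)
{
	/* Byte distance from slot a to slot b. */
	int32_t delta = (int32_t)((char *)b - (char *)a);
	int32_t tmp = *a;

	*a = *b + delta;	/* b's target, re-based to slot a */
	*b = tmp - delta;	/* a's target, re-based to slot b */
}

static void swap_check(void)
{
	uintptr_t t0 = (uintptr_t)&swap_demo.text[24];
	uintptr_t t1 = (uintptr_t)&swap_demo.text[8];

	swap_demo.slot[0] = (int32_t)(t0 - (uintptr_t)&swap_demo.slot[0]);
	swap_demo.slot[1] = (int32_t)(t1 - (uintptr_t)&swap_demo.slot[1]);

	rel_swap(&swap_demo.slot[0], &swap_demo.slot[1]);

	/* The targets have traded places; a plain value swap would be off by delta. */
	assert(rel_decode(&swap_demo.slot[0]) == t1);
	assert(rel_decode(&swap_demo.slot[1]) == t0);
}
```

In the patch the same callback also swaps the matching .orc_unwind entries, which need no correction because they hold no self-relative values.
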
2023-10-10 12:46:34

by Huacai Chen

Subject: Re: [PATCH v2 1/8] objtool/LoongArch: Enable objtool to be built

Hi, Tiezhu,

On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>
> Add the minimal changes needed to build objtool on LoongArch.
> Most of the functions are stubs that only fix the build errors
> seen with make -C tools/objtool.
>
> This is similar to commit e52ec98c5ab1 ("objtool/powerpc:
> Enable objtool to be built on ppc").
>
> Co-developed-by: Jinyang He <[email protected]>
> Signed-off-by: Jinyang He <[email protected]>
> Co-developed-by: Youling Tang <[email protected]>
> Signed-off-by: Youling Tang <[email protected]>
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> tools/objtool/arch/loongarch/Build | 2 +
> tools/objtool/arch/loongarch/decode.c | 71 ++++++++++++++++++++++
> .../objtool/arch/loongarch/include/arch/cfi_regs.h | 21 +++++++
> tools/objtool/arch/loongarch/include/arch/elf.h | 30 +++++++++
> .../objtool/arch/loongarch/include/arch/special.h | 33 ++++++++++
> tools/objtool/arch/loongarch/special.c | 15 +++++
> 6 files changed, 172 insertions(+)
> create mode 100644 tools/objtool/arch/loongarch/Build
> create mode 100644 tools/objtool/arch/loongarch/decode.c
> create mode 100644 tools/objtool/arch/loongarch/include/arch/cfi_regs.h
> create mode 100644 tools/objtool/arch/loongarch/include/arch/elf.h
> create mode 100644 tools/objtool/arch/loongarch/include/arch/special.h
> create mode 100644 tools/objtool/arch/loongarch/special.c
>
> diff --git a/tools/objtool/arch/loongarch/Build b/tools/objtool/arch/loongarch/Build
> new file mode 100644
> index 0000000..d24d563
> --- /dev/null
> +++ b/tools/objtool/arch/loongarch/Build
> @@ -0,0 +1,2 @@
> +objtool-y += decode.o
> +objtool-y += special.o
> diff --git a/tools/objtool/arch/loongarch/decode.c b/tools/objtool/arch/loongarch/decode.c
> new file mode 100644
> index 0000000..cc74ba4
> --- /dev/null
> +++ b/tools/objtool/arch/loongarch/decode.c
> @@ -0,0 +1,71 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <string.h>
> +#include <objtool/check.h>
> +
> +int arch_ftrace_match(char *name)
> +{
> + return !strcmp(name, "_mcount");
> +}
> +
> +unsigned long arch_jump_destination(struct instruction *insn)
> +{
> + return insn->offset + (insn->immediate << 2);
> +}
> +
> +unsigned long arch_dest_reloc_offset(int addend)
> +{
> + return addend;
> +}
> +
> +bool arch_pc_relative_reloc(struct reloc *reloc)
> +{
> + return false;
> +}
> +
> +bool arch_callee_saved_reg(unsigned char reg)
> +{
> + switch (reg) {
> + case CFI_RA:
> + case CFI_FP:
> + case CFI_S0 ... CFI_S8:
> + return true;
> + default:
> + return false;
> + }
> +}
> +
> +int arch_decode_hint_reg(u8 sp_reg, int *base)
> +{
> + return 0;
> +}
> +
> +int arch_decode_instruction(struct objtool_file *file, const struct section *sec,
> + unsigned long offset, unsigned int maxlen,
> + struct instruction *insn)
> +{
> + return 0;
> +}
> +
> +const char *arch_nop_insn(int len)
> +{
> + return NULL;
> +}
> +
> +const char *arch_ret_insn(int len)
> +{
> + return NULL;
> +}
> +
> +void arch_initial_func_cfi_state(struct cfi_init_state *state)
> +{
> + int i;
> +
> + for (i = 0; i < CFI_NUM_REGS; i++) {
> + state->regs[i].base = CFI_UNDEFINED;
> + state->regs[i].offset = 0;
> + }
> +
> + /* initial CFA (call frame address) */
> + state->cfa.base = CFI_SP;
> + state->cfa.offset = 0;
> +}
> diff --git a/tools/objtool/arch/loongarch/include/arch/cfi_regs.h b/tools/objtool/arch/loongarch/include/arch/cfi_regs.h
> new file mode 100644
> index 0000000..c768d39
> --- /dev/null
> +++ b/tools/objtool/arch/loongarch/include/arch/cfi_regs.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _OBJTOOL_ARCH_CFI_REGS_H
> +#define _OBJTOOL_ARCH_CFI_REGS_H
> +
> +#define CFI_RA 1
> +#define CFI_SP 3
> +#define CFI_FP 22
> +#define CFI_S0 23
> +#define CFI_S1 24
> +#define CFI_S2 25
> +#define CFI_S3 26
> +#define CFI_S4 27
> +#define CFI_S5 28
> +#define CFI_S6 29
> +#define CFI_S7 30
> +#define CFI_S8 31
> +#define CFI_NUM_REGS 32
> +
> +#define CFI_BP CFI_FP
> +
> +#endif /* _OBJTOOL_ARCH_CFI_REGS_H */
> diff --git a/tools/objtool/arch/loongarch/include/arch/elf.h b/tools/objtool/arch/loongarch/include/arch/elf.h
> new file mode 100644
> index 0000000..9623d66
> --- /dev/null
> +++ b/tools/objtool/arch/loongarch/include/arch/elf.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _OBJTOOL_ARCH_ELF_H
> +#define _OBJTOOL_ARCH_ELF_H
> +
> +/*
> + * See the following link for more info about ELF Relocation types:
> + * https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html#_relocations
> + */
> +#ifndef R_LARCH_NONE
> +#define R_LARCH_NONE 0
> +#endif
> +#ifndef R_LARCH_32
> +#define R_LARCH_32 1
> +#endif
> +#ifndef R_LARCH_64
> +#define R_LARCH_64 2
> +#endif
> +#ifndef R_LARCH_32_PCREL
> +#define R_LARCH_32_PCREL 99
> +#endif
> +
> +#define R_NONE R_LARCH_NONE
> +#define R_ABS32 R_LARCH_32
> +#define R_ABS64 R_LARCH_64
> +#define R_DATA32 R_LARCH_32_PCREL
> +#define R_DATA64 R_LARCH_32_PCREL
> +#define R_TEXT32 R_LARCH_32_PCREL
> +#define R_TEXT64 R_LARCH_32_PCREL
> +
> +#endif /* _OBJTOOL_ARCH_ELF_H */
> diff --git a/tools/objtool/arch/loongarch/include/arch/special.h b/tools/objtool/arch/loongarch/include/arch/special.h
> new file mode 100644
> index 0000000..1a8245c
> --- /dev/null
> +++ b/tools/objtool/arch/loongarch/include/arch/special.h
> @@ -0,0 +1,33 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _OBJTOOL_ARCH_SPECIAL_H
> +#define _OBJTOOL_ARCH_SPECIAL_H
> +
> +/*
> + * See more info about struct exception_table_entry
> + * in arch/loongarch/include/asm/extable.h
> + */
> +#define EX_ENTRY_SIZE 12
> +#define EX_ORIG_OFFSET 0
> +#define EX_NEW_OFFSET 4
Other archs use tabs for indentation in special.h.

Huacai
> +
> +/*
> + * See more info about struct jump_entry
> + * in include/linux/jump_label.h
> + */
> +#define JUMP_ENTRY_SIZE 16
> +#define JUMP_ORIG_OFFSET 0
> +#define JUMP_NEW_OFFSET 4
> +#define JUMP_KEY_OFFSET 8
> +
> +/*
> + * See more info about struct alt_instr
> + * in arch/loongarch/include/asm/alternative.h
> + */
> +#define ALT_ENTRY_SIZE 12
> +#define ALT_ORIG_OFFSET 0
> +#define ALT_NEW_OFFSET 4
> +#define ALT_FEATURE_OFFSET 8
> +#define ALT_ORIG_LEN_OFFSET 10
> +#define ALT_NEW_LEN_OFFSET 11
> +
> +#endif /* _OBJTOOL_ARCH_SPECIAL_H */
> diff --git a/tools/objtool/arch/loongarch/special.c b/tools/objtool/arch/loongarch/special.c
> new file mode 100644
> index 0000000..9bba1e9
> --- /dev/null
> +++ b/tools/objtool/arch/loongarch/special.c
> @@ -0,0 +1,15 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <objtool/special.h>
> +
> +bool arch_support_alt_relocation(struct special_alt *special_alt,
> + struct instruction *insn,
> + struct reloc *reloc)
> +{
> + return false;
> +}
> +
> +struct reloc *arch_find_switch_table(struct objtool_file *file,
> + struct instruction *insn)
> +{
> + return NULL;
> +}
> --
> 2.1.0
>
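
As an aside on the ALT_* constants in the quoted special.h: they are byte offsets into one table entry, so they can be cross-checked against a struct of the same shape. The sketch below is an illustration only. struct alt_instr_mirror and its field names are assumptions made here rather than a copy of arch/loongarch/include/asm/alternative.h, alt_feature() is a hypothetical helper that reads one field the way a tool working on raw section bytes might, and the packed attribute is a GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ALT_ENTRY_SIZE		12
#define ALT_ORIG_OFFSET		0
#define ALT_NEW_OFFSET		4
#define ALT_FEATURE_OFFSET	8
#define ALT_ORIG_LEN_OFFSET	10
#define ALT_NEW_LEN_OFFSET	11

/* Assumed layout: two 32-bit offsets, a 16-bit feature, two 8-bit lengths. */
struct alt_instr_mirror {
	int32_t instr_offset;
	int32_t replace_offset;
	uint16_t feature;
	uint8_t instrlen;
	uint8_t replacementlen;
} __attribute__((packed));

/* Compile-time checks that the constants match the assumed layout. */
_Static_assert(sizeof(struct alt_instr_mirror) == ALT_ENTRY_SIZE,
	       "entry size mismatch");
_Static_assert(offsetof(struct alt_instr_mirror, instr_offset) == ALT_ORIG_OFFSET,
	       "orig offset mismatch");
_Static_assert(offsetof(struct alt_instr_mirror, replace_offset) == ALT_NEW_OFFSET,
	       "new offset mismatch");
_Static_assert(offsetof(struct alt_instr_mirror, feature) == ALT_FEATURE_OFFSET,
	       "feature offset mismatch");
_Static_assert(offsetof(struct alt_instr_mirror, instrlen) == ALT_ORIG_LEN_OFFSET,
	       "orig len offset mismatch");
_Static_assert(offsetof(struct alt_instr_mirror, replacementlen) == ALT_NEW_LEN_OFFSET,
	       "new len offset mismatch");

/* Read the feature field from the raw bytes of one entry (host byte order). */
static uint16_t alt_feature(const unsigned char *entry)
{
	uint16_t v;

	memcpy(&v, entry + ALT_FEATURE_OFFSET, sizeof(v));
	return v;
}
```

If the struct and the constants ever drift apart, the build fails at the static assertions instead of the tool silently reading the wrong bytes.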

2023-10-10 12:53:18

by Huacai Chen

Subject: Re: [PATCH v2 4/8] objtool/LoongArch: Enable orc to be built

Hi, Tiezhu,

On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>
> Implement arch-specific init_orc_entry(), reg_name(), orc_type_name(),
> print_reg() and orc_print_dump(), then set BUILD_ORC to y to build the
> ORC-related files.
>
> Co-developed-by: Jinyang He <[email protected]>
> Signed-off-by: Jinyang He <[email protected]>
> Co-developed-by: Youling Tang <[email protected]>
> Signed-off-by: Youling Tang <[email protected]>
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> tools/arch/loongarch/include/asm/orc_types.h | 58 ++++++++++
> tools/objtool/Makefile | 4 +
> tools/objtool/arch/loongarch/Build | 1 +
> tools/objtool/arch/loongarch/decode.c | 16 +++
> tools/objtool/arch/loongarch/orc.c | 155 +++++++++++++++++++++++++++
> 5 files changed, 234 insertions(+)
> create mode 100644 tools/arch/loongarch/include/asm/orc_types.h
> create mode 100644 tools/objtool/arch/loongarch/orc.c
>
> diff --git a/tools/arch/loongarch/include/asm/orc_types.h b/tools/arch/loongarch/include/asm/orc_types.h
> new file mode 100644
> index 0000000..1d37e62
> --- /dev/null
> +++ b/tools/arch/loongarch/include/asm/orc_types.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _ORC_TYPES_H
> +#define _ORC_TYPES_H
> +
> +#include <linux/types.h>
> +
> +/*
> + * The ORC_REG_* registers are base registers which are used to find other
> + * registers on the stack.
> + *
> + * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the
> + * address of the previous frame: the caller's SP before it called the current
> + * function.
> + *
> + * ORC_REG_UNDEFINED means the corresponding register's value didn't change in
> + * the current frame.
> + *
> + * The most commonly used base registers are SP and BP -- which the previous SP
> + * is usually based on -- and PREV_SP and UNDEFINED -- which the previous BP is
> + * usually based on.
> + *
> + * The rest of the base registers are needed for special cases like entry code
> + * and GCC realigned stacks.
> + */
> +#define ORC_REG_UNDEFINED 0
> +#define ORC_REG_PREV_SP 1
> +#define ORC_REG_SP 2
> +#define ORC_REG_BP 3
There is no BP register for LoongArch, so I think all 'BP' should be
'FP' in this patch.

Huacai

> +#define ORC_REG_MAX 4
> +
> +#define ORC_TYPE_UNDEFINED 0
> +#define ORC_TYPE_END_OF_STACK 1
> +#define ORC_TYPE_CALL 2
> +#define ORC_TYPE_REGS 3
> +#define ORC_TYPE_REGS_PARTIAL 4
> +
> +#ifndef __ASSEMBLY__
> +/*
> + * This struct is more or less a vastly simplified version of the DWARF Call
> + * Frame Information standard. It contains only the necessary parts of DWARF
> + * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
> + * unwinder how to find the previous SP and BP (and sometimes entry regs) on
> + * the stack for a given code address. Each instance of the struct corresponds
> + * to one or more code locations.
> + */
> +struct orc_entry {
> + s16 sp_offset;
> + s16 bp_offset;
> + s16 ra_offset;
> + unsigned int sp_reg:4;
> + unsigned int bp_reg:4;
> + unsigned int ra_reg:4;
> + unsigned int type:3;
> + unsigned int signal:1;
> +};
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ORC_TYPES_H */
> diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
> index 83b100c..bf7f7f8 100644
> --- a/tools/objtool/Makefile
> +++ b/tools/objtool/Makefile
> @@ -57,6 +57,10 @@ ifeq ($(SRCARCH),x86)
> BUILD_ORC := y
> endif
>
> +ifeq ($(SRCARCH),loongarch)
> + BUILD_ORC := y
> +endif
> +
> export BUILD_ORC
> export srctree OUTPUT CFLAGS SRCARCH AWK
> include $(srctree)/tools/build/Makefile.include
> diff --git a/tools/objtool/arch/loongarch/Build b/tools/objtool/arch/loongarch/Build
> index d24d563..1d4b784 100644
> --- a/tools/objtool/arch/loongarch/Build
> +++ b/tools/objtool/arch/loongarch/Build
> @@ -1,2 +1,3 @@
> objtool-y += decode.o
> objtool-y += special.o
> +objtool-y += orc.o
> diff --git a/tools/objtool/arch/loongarch/decode.c b/tools/objtool/arch/loongarch/decode.c
> index 3a426e4..1c96759 100644
> --- a/tools/objtool/arch/loongarch/decode.c
> +++ b/tools/objtool/arch/loongarch/decode.c
> @@ -3,6 +3,8 @@
> #include <objtool/check.h>
> #include <objtool/warn.h>
> #include <asm/inst.h>
> +#include <asm/orc_types.h>
> +#include <linux/objtool_types.h>
>
> int arch_ftrace_match(char *name)
> {
> @@ -38,6 +40,20 @@ bool arch_callee_saved_reg(unsigned char reg)
>
> int arch_decode_hint_reg(u8 sp_reg, int *base)
> {
> + switch (sp_reg) {
> + case ORC_REG_UNDEFINED:
> + *base = CFI_UNDEFINED;
> + break;
> + case ORC_REG_SP:
> + *base = CFI_SP;
> + break;
> + case ORC_REG_BP:
> + *base = CFI_FP;
> + break;
> + default:
> + return -1;
> + }
> +
> return 0;
> }
>
> diff --git a/tools/objtool/arch/loongarch/orc.c b/tools/objtool/arch/loongarch/orc.c
> new file mode 100644
> index 0000000..7d7ecee
> --- /dev/null
> +++ b/tools/objtool/arch/loongarch/orc.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +#include <linux/objtool_types.h>
> +#include <asm/orc_types.h>
> +
> +#include <objtool/check.h>
> +#include <objtool/orc.h>
> +#include <objtool/warn.h>
> +#include <objtool/endianness.h>
> +
> +int init_orc_entry(struct orc_entry *orc, struct cfi_state *cfi, struct instruction *insn)
> +{
> + struct cfi_reg *bp = &cfi->regs[CFI_BP];
> + struct cfi_reg *ra = &cfi->regs[CFI_RA];
> +
> + memset(orc, 0, sizeof(*orc));
> +
> + if (!cfi) {
> + /*
> + * This is usually either unreachable nops/traps (which don't
> + * trigger unreachable instruction warnings), or
> + * STACK_FRAME_NON_STANDARD functions.
> + */
> + orc->type = ORC_TYPE_UNDEFINED;
> + return 0;
> + }
> +
> + switch (cfi->type) {
> + case UNWIND_HINT_TYPE_UNDEFINED:
> + orc->type = ORC_TYPE_UNDEFINED;
> + return 0;
> + case UNWIND_HINT_TYPE_END_OF_STACK:
> + orc->type = ORC_TYPE_END_OF_STACK;
> + return 0;
> + case UNWIND_HINT_TYPE_CALL:
> + orc->type = ORC_TYPE_CALL;
> + break;
> + case UNWIND_HINT_TYPE_REGS:
> + orc->type = ORC_TYPE_REGS;
> + break;
> + case UNWIND_HINT_TYPE_REGS_PARTIAL:
> + orc->type = ORC_TYPE_REGS_PARTIAL;
> + break;
> + default:
> + WARN_INSN(insn, "unknown unwind hint type %d", cfi->type);
> + return -1;
> + }
> +
> + orc->signal = cfi->signal;
> +
> + switch (cfi->cfa.base) {
> + case CFI_SP:
> + orc->sp_reg = ORC_REG_SP;
> + break;
> + case CFI_BP:
> + orc->sp_reg = ORC_REG_BP;
> + break;
> + default:
> + WARN_INSN(insn, "unknown CFA base reg %d", cfi->cfa.base);
> + return -1;
> + }
> +
> + switch (bp->base) {
> + case CFI_UNDEFINED:
> + orc->bp_reg = ORC_REG_UNDEFINED;
> + orc->bp_offset = 0;
> + break;
> + case CFI_CFA:
> + orc->bp_reg = ORC_REG_PREV_SP;
> + orc->bp_offset = bp->offset;
> + break;
> + case CFI_BP:
> + orc->bp_reg = ORC_REG_BP;
> + break;
> + default:
> + WARN_INSN(insn, "unknown BP base reg %d", bp->base);
> + return -1;
> + }
> +
> + switch (ra->base) {
> + case CFI_UNDEFINED:
> + orc->ra_reg = ORC_REG_UNDEFINED;
> + orc->ra_offset = 0;
> + break;
> + case CFI_CFA:
> + orc->ra_reg = ORC_REG_PREV_SP;
> + orc->ra_offset = ra->offset;
> + break;
> + case CFI_BP:
> + orc->ra_reg = ORC_REG_BP;
> + break;
> + default:
> + WARN_INSN(insn, "unknown RA base reg %d", ra->base);
> + return -1;
> + }
> +
> + orc->sp_offset = cfi->cfa.offset;
> +
> + return 0;
> +}
> +
> +static const char *reg_name(unsigned int reg)
> +{
> + switch (reg) {
> + case ORC_REG_SP:
> + return "sp";
> + case ORC_REG_BP:
> + return "fp";
> + case ORC_REG_PREV_SP:
> + return "prevsp";
> + default:
> + return "?";
> + }
> +}
> +
> +static const char *orc_type_name(unsigned int type)
> +{
> + switch (type) {
> + case UNWIND_HINT_TYPE_CALL:
> + return "call";
> + case UNWIND_HINT_TYPE_REGS:
> + return "regs";
> + case UNWIND_HINT_TYPE_REGS_PARTIAL:
> + return "regs (partial)";
> + default:
> + return "?";
> + }
> +}
> +
> +static void print_reg(unsigned int reg, int offset)
> +{
> + if (reg == ORC_REG_UNDEFINED)
> + printf(" (und) ");
> + else
> + printf("%s + %3d", reg_name(reg), offset);
> +
> +}
> +
> +void orc_print_dump(struct elf *dummy_elf, struct orc_entry *orc, int i)
> +{
> + printf("type:%s", orc_type_name(orc[i].type));
> +
> + printf(" sp:");
> +
> + print_reg(orc[i].sp_reg, bswap_if_needed(dummy_elf, orc[i].sp_offset));
> +
> + printf(" bp:");
> +
> + print_reg(orc[i].bp_reg, bswap_if_needed(dummy_elf, orc[i].bp_offset));
> +
> + printf(" ra:");
> +
> + print_reg(orc[i].ra_reg, bswap_if_needed(dummy_elf, orc[i].ra_offset));
> +
> + printf(" signal:%d\n", orc[i].signal);
> +}
> --
> 2.1.0
>
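
Editorial note, not part of the patch: the dump helpers in the quoted orc.c print each base register as a name plus a width-3 offset. A minimal sketch of that formatting follows, assuming the ORC_REG_* values from the quoted orc_types.h. The helper format_reg() is hypothetical; the patch's print_reg() writes to stdout instead of a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values copied from the quoted orc_types.h. */
#define ORC_REG_UNDEFINED	0
#define ORC_REG_PREV_SP		1
#define ORC_REG_SP		2
#define ORC_REG_BP		3

/* Same name mapping as reg_name() in the quoted orc.c. */
static const char *reg_name(unsigned int reg)
{
	switch (reg) {
	case ORC_REG_SP:
		return "sp";
	case ORC_REG_BP:
		return "fp";
	case ORC_REG_PREV_SP:
		return "prevsp";
	default:
		return "?";
	}
}

/*
 * Hypothetical helper: formats like print_reg() in the patch, but into a
 * caller-supplied buffer so the result can be inspected.
 */
static int format_reg(char *buf, size_t len, unsigned int reg, int offset)
{
	assert(buf && len > 0);

	if (reg == ORC_REG_UNDEFINED)
		return snprintf(buf, len, " (und) ");

	return snprintf(buf, len, "%s + %3d", reg_name(reg), offset);
}
```

Note that ORC_REG_BP is printed as "fp", which matches the LoongArch register naming.
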

2023-10-10 13:48:03

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

Hi Tiezhu,

kernel test robot noticed the following build errors:

[auto build test ERROR on masahiroy-kbuild/for-next]
[also build test ERROR on masahiroy-kbuild/fixes linus/master v6.6-rc5 next-20231010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Tiezhu-Yang/objtool-LoongArch-Enable-objtool-to-be-built/20231009-210700
base: https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git for-next
patch link: https://lore.kernel.org/r/1696856590-30298-9-git-send-email-yangtiezhu%40loongson.cn
patch subject: [PATCH v2 8/8] LoongArch: Add ORC unwinder support
config: x86_64-rhel-8.3-rust (https://download.01.org/0day-ci/archive/20231010/[email protected]/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231010/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

In file included from scripts/sorttable.c:201:
>> scripts/sorttable.h:96:10: fatal error: 'asm/orc_types.h' file not found
#include <asm/orc_types.h>
^~~~~~~~~~~~~~~~~
1 error generated.


vim +96 scripts/sorttable.h

a79f248b9b309e scripts/sortextable.h David Daney 2012-04-19 93
57fa1899428538 scripts/sorttable.h Shile Zhang 2019-12-04 94 #if defined(SORTTABLE_64) && defined(UNWINDER_ORC_ENABLED)
57fa1899428538 scripts/sorttable.h Shile Zhang 2019-12-04 95 /* ORC unwinder only support X86_64 */
57fa1899428538 scripts/sorttable.h Shile Zhang 2019-12-04 @96 #include <asm/orc_types.h>
57fa1899428538 scripts/sorttable.h Shile Zhang 2019-12-04 97

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-10-10 14:43:50

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

Hi Tiezhu,

kernel test robot noticed the following build errors:

[auto build test ERROR on masahiroy-kbuild/for-next]
[also build test ERROR on masahiroy-kbuild/fixes linus/master v6.6-rc5 next-20231010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Tiezhu-Yang/objtool-LoongArch-Enable-objtool-to-be-built/20231009-210700
base: https://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git for-next
patch link: https://lore.kernel.org/r/1696856590-30298-9-git-send-email-yangtiezhu%40loongson.cn
patch subject: [PATCH v2 8/8] LoongArch: Add ORC unwinder support
config: x86_64-randconfig-014-20231010 (https://download.01.org/0day-ci/archive/20231010/[email protected]/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231010/[email protected]/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

In file included from scripts/sorttable.c:201:
>> scripts/sorttable.h:96:10: fatal error: asm/orc_types.h: No such file or directory
96 | #include <asm/orc_types.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
make[3]: *** [scripts/Makefile.host:114: scripts/sorttable] Error 1 shuffle=2318162869
make[3]: Target 'scripts/' not remade because of errors.
make[2]: *** [Makefile:1186: scripts] Error 2 shuffle=2318162869
make[2]: Target 'prepare' not remade because of errors.
make[1]: *** [Makefile:234: __sub-make] Error 2 shuffle=2318162869
make[1]: Target 'prepare' not remade because of errors.
make: *** [Makefile:234: __sub-make] Error 2 shuffle=2318162869
make: Target 'prepare' not remade because of errors.


vim +96 scripts/sorttable.h

a79f248b9b309ebb scripts/sortextable.h David Daney 2012-04-19 93
57fa1899428538e3 scripts/sorttable.h Shile Zhang 2019-12-04 94 #if defined(SORTTABLE_64) && defined(UNWINDER_ORC_ENABLED)
57fa1899428538e3 scripts/sorttable.h Shile Zhang 2019-12-04 95 /* ORC unwinder only support X86_64 */
57fa1899428538e3 scripts/sorttable.h Shile Zhang 2019-12-04 @96 #include <asm/orc_types.h>
57fa1899428538e3 scripts/sorttable.h Shile Zhang 2019-12-04 97

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

2023-10-11 04:41:32

by Huacai Chen

[permalink] [raw]
Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

Hi, Tiezhu,

Maybe "LoongArch: Add ORC stack unwinder support" would be a better subject.

On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>
> The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
> similar in concept to a DWARF unwinder. The difference is that the format
> of the ORC data is much simpler than DWARF, which in turn allows the ORC
> unwinder to be much simpler and faster.
>
> The ORC data consists of unwind tables which are generated by objtool.
> They contain out-of-band data which is used by the in-kernel ORC unwinder.
> Objtool generates the ORC data by first doing compile-time stack metadata
> validation (CONFIG_STACK_VALIDATION). After analyzing all the code paths
> of a .o file, it determines information about the stack state at each
> instruction address in the file and outputs that information to the
> .orc_unwind and .orc_unwind_ip sections.
>
> The per-object ORC sections are combined at link time and are sorted and
> post-processed at boot time. The unwinder uses the resulting data to
> correlate instruction addresses with their stack states at run time.
>
> Most of the logic is similar to x86. In order to get ra info before ra
> is saved onto the stack, add ra_reg and ra_offset to orc_entry. At the same
> time, modify some arch-specific code to silence the objtool warnings.
>
> Co-developed-by: Jinyang He <[email protected]>
> Signed-off-by: Jinyang He <[email protected]>
> Co-developed-by: Youling Tang <[email protected]>
> Signed-off-by: Youling Tang <[email protected]>
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> arch/loongarch/Kconfig | 2 +
> arch/loongarch/Kconfig.debug | 11 +
> arch/loongarch/Makefile | 23 ++
> arch/loongarch/configs/loongson3_defconfig | 1 +
> arch/loongarch/include/asm/Kbuild | 1 +
> arch/loongarch/include/asm/bug.h | 1 +
> arch/loongarch/include/asm/linkage.h | 2 +
> arch/loongarch/include/asm/module.h | 7 +
> arch/loongarch/include/asm/orc_header.h | 19 +
> arch/loongarch/include/asm/orc_lookup.h | 34 ++
> arch/loongarch/include/asm/orc_types.h | 58 +++
> arch/loongarch/include/asm/stackframe.h | 3 +
> arch/loongarch/include/asm/unwind.h | 22 +-
> arch/loongarch/include/asm/unwind_hints.h | 28 ++
> arch/loongarch/kernel/Makefile | 3 +
> arch/loongarch/kernel/entry.S | 9 +-
> arch/loongarch/kernel/genex.S | 20 +-
> arch/loongarch/kernel/head.S | 1 +
> arch/loongarch/kernel/module.c | 11 +-
> arch/loongarch/kernel/relocate_kernel.S | 2 +
> arch/loongarch/kernel/setup.c | 2 +
> arch/loongarch/kernel/stacktrace.c | 1 +
> arch/loongarch/kernel/unwind_orc.c | 571 +++++++++++++++++++++++++++++
> arch/loongarch/kernel/vmlinux.lds.S | 3 +
> arch/loongarch/lib/Makefile | 2 +
> arch/loongarch/mm/tlbex.S | 45 ++-
> arch/loongarch/power/Makefile | 2 +
> arch/loongarch/vdso/Makefile | 1 +
> include/linux/compiler.h | 9 +
> scripts/Makefile | 5 +-
> 30 files changed, 867 insertions(+), 32 deletions(-)
> create mode 100644 arch/loongarch/include/asm/orc_header.h
> create mode 100644 arch/loongarch/include/asm/orc_lookup.h
> create mode 100644 arch/loongarch/include/asm/orc_types.h
> create mode 100644 arch/loongarch/include/asm/unwind_hints.h
> create mode 100644 arch/loongarch/kernel/unwind_orc.c
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index e14396a..21ef3bb 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -131,6 +131,7 @@ config LOONGARCH
> select HAVE_KRETPROBES
> select HAVE_MOD_ARCH_SPECIFIC
> select HAVE_NMI
> + select HAVE_OBJTOOL if AS_HAS_EXPLICIT_RELOCS
> select HAVE_PCI
> select HAVE_PERF_EVENTS
> select HAVE_PERF_REGS
> @@ -141,6 +142,7 @@ config LOONGARCH
> select HAVE_SAMPLE_FTRACE_DIRECT
> select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
> select HAVE_SETUP_PER_CPU_AREA if NUMA
> + select HAVE_STACK_VALIDATION if HAVE_OBJTOOL
> select HAVE_STACKPROTECTOR
> select HAVE_SYSCALL_TRACEPOINTS
> select HAVE_TIF_NOHZ
> diff --git a/arch/loongarch/Kconfig.debug b/arch/loongarch/Kconfig.debug
> index 8d36aab..98d6063 100644
> --- a/arch/loongarch/Kconfig.debug
> +++ b/arch/loongarch/Kconfig.debug
> @@ -26,4 +26,15 @@ config UNWINDER_PROLOGUE
> Some of the addresses it reports may be incorrect (but better than the
> Guess unwinder).
>
> +config UNWINDER_ORC
> + bool "ORC unwinder"
> + select OBJTOOL
> + help
> + This option enables the ORC (Oops Rewind Capability) unwinder for
> + unwinding kernel stack traces. It uses a custom data format which is
> + a simplified version of the DWARF Call Frame Information standard.
> +
> + Enabling this option will increase the kernel's runtime memory usage
> + by roughly 2-4MB, depending on your kernel config.
> +
> endchoice
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index fb0fada..89a6e61 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -25,6 +25,29 @@ endif
> 32bit-emul = elf32loongarch
> 64bit-emul = elf64loongarch
>
> +ifdef CONFIG_OBJTOOL
> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
> +ifeq ($(shell as --help 2>&1 | grep -e '-mthin-add-sub'),)
> + $(error Sorry, you need a newer gas version with -mthin-add-sub option)
I'd prefer not to error out here, because without this option we can
still build a runnable kernel.

> +endif
> +KBUILD_AFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> +KBUILD_CFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> +KBUILD_CFLAGS += -fno-optimize-sibling-calls -fno-jump-tables -falign-functions=4
> +endif
> +
> +ifdef CONFIG_UNWINDER_ORC
> +orc_hash_h := arch/$(SRCARCH)/include/generated/asm/orc_hash.h
> +orc_hash_sh := $(srctree)/scripts/orc_hash.sh
> +targets += $(orc_hash_h)
> +quiet_cmd_orc_hash = GEN $@
> + cmd_orc_hash = mkdir -p $(dir $@); \
> + $(CONFIG_SHELL) $(orc_hash_sh) < $< > $@
> +$(orc_hash_h): $(srctree)/arch/loongarch/include/asm/orc_types.h $(orc_hash_sh) FORCE
> + $(call if_changed,orc_hash)
> +archprepare: $(orc_hash_h)
> +endif
> +
> ifdef CONFIG_DYNAMIC_FTRACE
> KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> diff --git a/arch/loongarch/configs/loongson3_defconfig b/arch/loongarch/configs/loongson3_defconfig
> index a3b52aa..de911c3 100644
> --- a/arch/loongarch/configs/loongson3_defconfig
> +++ b/arch/loongarch/configs/loongson3_defconfig
> @@ -5,6 +5,7 @@ CONFIG_NO_HZ=y
> CONFIG_HIGH_RES_TIMERS=y
> CONFIG_BPF_SYSCALL=y
> CONFIG_BPF_JIT=y
> +CONFIG_BPF_JIT_ALWAYS_ON=y
> CONFIG_PREEMPT=y
> CONFIG_BSD_PROCESS_ACCT=y
> CONFIG_BSD_PROCESS_ACCT_V3=y
> diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
> index 93783fa..2bb285c 100644
> --- a/arch/loongarch/include/asm/Kbuild
> +++ b/arch/loongarch/include/asm/Kbuild
> @@ -1,4 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0
> +generated-y += orc_hash.h
> generic-y += dma-contiguous.h
> generic-y += mcs_spinlock.h
> generic-y += parport.h
> diff --git a/arch/loongarch/include/asm/bug.h b/arch/loongarch/include/asm/bug.h
> index d4ca3ba..0838887 100644
> --- a/arch/loongarch/include/asm/bug.h
> +++ b/arch/loongarch/include/asm/bug.h
> @@ -44,6 +44,7 @@
> do { \
> instrumentation_begin(); \
> __BUG_FLAGS(BUGFLAG_WARNING|(flags)); \
> + annotate_reachable(); \
> instrumentation_end(); \
> } while (0)
>
> diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h
> index 81b0c4c..ae4e100 100644
> --- a/arch/loongarch/include/asm/linkage.h
> +++ b/arch/loongarch/include/asm/linkage.h
> @@ -2,6 +2,8 @@
> #ifndef __ASM_LINKAGE_H
> #define __ASM_LINKAGE_H
>
> +#include <asm/unwind_hints.h>
> +
> #define __ALIGN .align 2
> #define __ALIGN_STR __stringify(__ALIGN)
>
> diff --git a/arch/loongarch/include/asm/module.h b/arch/loongarch/include/asm/module.h
> index 2ecd82b..96af0ba 100644
> --- a/arch/loongarch/include/asm/module.h
> +++ b/arch/loongarch/include/asm/module.h
> @@ -6,6 +6,7 @@
> #define _ASM_MODULE_H
>
> #include <asm/inst.h>
> +#include <asm/orc_types.h>
> #include <asm-generic/module.h>
>
> #define RELA_STACK_DEPTH 16
> @@ -23,6 +24,12 @@ struct mod_arch_specific {
>
> /* For CONFIG_DYNAMIC_FTRACE */
> struct plt_entry *ftrace_trampolines;
> +
> +#ifdef CONFIG_UNWINDER_ORC
> + unsigned int num_orcs;
> + int *orc_unwind_ip;
> + struct orc_entry *orc_unwind;
> +#endif
> };
>
> struct got_entry {
> diff --git a/arch/loongarch/include/asm/orc_header.h b/arch/loongarch/include/asm/orc_header.h
> new file mode 100644
> index 0000000..07bacf3
> --- /dev/null
> +++ b/arch/loongarch/include/asm/orc_header.h
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/* Copyright (c) Meta Platforms, Inc. and affiliates. */
> +
> +#ifndef _ORC_HEADER_H
> +#define _ORC_HEADER_H
> +
> +#include <linux/types.h>
> +#include <linux/compiler.h>
> +#include <asm/orc_hash.h>
> +
> +/*
> + * The header is currently a 20-byte hash of the ORC entry definition; see
> + * scripts/orc_hash.sh.
> + */
> +#define ORC_HEADER \
> + __used __section(".orc_header") __aligned(4) \
> + static const u8 orc_header[] = { ORC_HASH }
> +
> +#endif /* _ORC_HEADER_H */
> diff --git a/arch/loongarch/include/asm/orc_lookup.h b/arch/loongarch/include/asm/orc_lookup.h
> new file mode 100644
> index 0000000..2416312
> --- /dev/null
> +++ b/arch/loongarch/include/asm/orc_lookup.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
> + */
> +#ifndef _ORC_LOOKUP_H
> +#define _ORC_LOOKUP_H
> +
> +/*
> + * This is a lookup table for speeding up access to the .orc_unwind table.
> + * Given an input address offset, the corresponding lookup table entry
> + * specifies a subset of the .orc_unwind table to search.
> + *
> + * Each block represents the end of the previous range and the start of the
> + * next range. An extra block is added to give the last range an end.
> + *
> + * The block size should be a power of 2 to avoid a costly 'div' instruction.
> + *
> + * A block size of 256 was chosen because it roughly doubles unwinder
> + * performance while only adding ~5% to the ORC data footprint.
> + */
> +#define LOOKUP_BLOCK_ORDER 8
> +#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER)
> +
> +#ifndef LINKER_SCRIPT
> +
> +extern unsigned int orc_lookup[];
> +extern unsigned int orc_lookup_end[];
> +
> +#define LOOKUP_START_IP (unsigned long)_stext
> +#define LOOKUP_STOP_IP (unsigned long)_etext
> +
> +#endif /* LINKER_SCRIPT */
> +
> +#endif /* _ORC_LOOKUP_H */
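
Editorial note, not part of the patch or the review: the header comment above says a power-of-2 block size avoids a division. A minimal sketch of that index computation follows, assuming the LOOKUP_BLOCK_ORDER value quoted above. The helpers lookup_block_index() and blocks_needed() are hypothetical names, and text_start stands in for LOOKUP_START_IP.

```c
#include <assert.h>

#define LOOKUP_BLOCK_ORDER	8
#define LOOKUP_BLOCK_SIZE	(1UL << LOOKUP_BLOCK_ORDER)

/*
 * Hypothetical helper: map an instruction address to the index of its
 * lookup block. text_start stands in for LOOKUP_START_IP.
 */
static unsigned long lookup_block_index(unsigned long ip,
					unsigned long text_start)
{
	assert(ip >= text_start);

	/* A shift replaces the divide because the block size is 1 << ORDER. */
	return (ip - text_start) >> LOOKUP_BLOCK_ORDER;
}

/* Hypothetical helper: blocks needed to cover [start, stop), rounded up. */
static unsigned long blocks_needed(unsigned long start, unsigned long stop)
{
	return (stop - start + LOOKUP_BLOCK_SIZE - 1) >> LOOKUP_BLOCK_ORDER;
}
```

With 256-byte blocks, addresses 0x1000 through 0x10ff share block 0 and 0x1100 starts block 1.
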
> diff --git a/arch/loongarch/include/asm/orc_types.h b/arch/loongarch/include/asm/orc_types.h
> new file mode 100644
> index 0000000..1d37e62
> --- /dev/null
> +++ b/arch/loongarch/include/asm/orc_types.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _ORC_TYPES_H
> +#define _ORC_TYPES_H
> +
> +#include <linux/types.h>
> +
> +/*
> + * The ORC_REG_* registers are base registers which are used to find other
> + * registers on the stack.
> + *
> + * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the
> + * address of the previous frame: the caller's SP before it called the current
> + * function.
> + *
> + * ORC_REG_UNDEFINED means the corresponding register's value didn't change in
> + * the current frame.
> + *
> + * The most commonly used base registers are SP and BP -- which the previous SP
> + * is usually based on -- and PREV_SP and UNDEFINED -- which the previous BP is
> + * usually based on.
> + *
> + * The rest of the base registers are needed for special cases like entry code
> + * and GCC realigned stacks.
> + */
> +#define ORC_REG_UNDEFINED 0
> +#define ORC_REG_PREV_SP 1
> +#define ORC_REG_SP 2
> +#define ORC_REG_BP 3
Use FP instead of BP in this patch, too.

> +#define ORC_REG_MAX 4
> +
> +#define ORC_TYPE_UNDEFINED 0
> +#define ORC_TYPE_END_OF_STACK 1
> +#define ORC_TYPE_CALL 2
> +#define ORC_TYPE_REGS 3
> +#define ORC_TYPE_REGS_PARTIAL 4
> +
> +#ifndef __ASSEMBLY__
> +/*
> + * This struct is more or less a vastly simplified version of the DWARF Call
> + * Frame Information standard. It contains only the necessary parts of DWARF
> + * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
> + * unwinder how to find the previous SP and BP (and sometimes entry regs) on
> + * the stack for a given code address. Each instance of the struct corresponds
> + * to one or more code locations.
> + */
> +struct orc_entry {
> + s16 sp_offset;
> + s16 bp_offset;
> + s16 ra_offset;
> + unsigned int sp_reg:4;
> + unsigned int bp_reg:4;
> + unsigned int ra_reg:4;
> + unsigned int type:3;
> + unsigned int signal:1;
> +};
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ORC_TYPES_H */
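
Editorial note, not part of the patch or the review: the struct comment above says an entry tells the unwinder how to find the previous SP. A minimal sketch of that step follows, assuming the ORC_REG_* values quoted above. The reduced struct orc_entry_sketch and the helper prev_sp() are hypothetical, and the real unwind_orc.c may handle more cases.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted orc_types.h. */
#define ORC_REG_UNDEFINED	0
#define ORC_REG_PREV_SP		1
#define ORC_REG_SP		2
#define ORC_REG_BP		3

/* Reduced, hypothetical copy of orc_entry: only the CFA-related fields. */
struct orc_entry_sketch {
	int16_t sp_offset;
	unsigned int sp_reg:4;
};

/*
 * Hypothetical helper: compute the caller's SP (the CFA) from the current
 * SP and FP values. Returns 0 on success, -1 for an unsupported base.
 */
static int prev_sp(const struct orc_entry_sketch *orc,
		   unsigned long sp, unsigned long fp, unsigned long *out)
{
	switch (orc->sp_reg) {
	case ORC_REG_SP:
		*out = sp + orc->sp_offset;
		return 0;
	case ORC_REG_BP:
		*out = fp + orc->sp_offset;
		return 0;
	default:
		return -1;
	}
}
```

The same base-plus-offset pattern would apply to the bp and ra fields, with ORC_REG_PREV_SP as the usual base once the CFA is known.
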
> diff --git a/arch/loongarch/include/asm/stackframe.h b/arch/loongarch/include/asm/stackframe.h
> index 4fb1e64..45b507a 100644
> --- a/arch/loongarch/include/asm/stackframe.h
> +++ b/arch/loongarch/include/asm/stackframe.h
> @@ -13,6 +13,7 @@
> #include <asm/asm-offsets.h>
> #include <asm/loongarch.h>
> #include <asm/thread_info.h>
> +#include <asm/unwind_hints.h>
>
> /* Make the addition of cfi info a little easier. */
> .macro cfi_rel_offset reg offset=0 docfi=0
> @@ -162,6 +163,7 @@
> li.w t0, CSR_CRMD_WE
> csrxchg t0, t0, LOONGARCH_CSR_CRMD
> #endif
> + UNWIND_HINT_REGS
> .endm
>
> .macro SAVE_ALL docfi=0
> @@ -219,6 +221,7 @@
>
> .macro RESTORE_SP_AND_RET docfi=0
> cfi_ld sp, PT_R3, \docfi
> + UNWIND_HINT_FUNC
> ertn
> .endm
>
> diff --git a/arch/loongarch/include/asm/unwind.h b/arch/loongarch/include/asm/unwind.h
> index b9dce87..d36e04e 100644
> --- a/arch/loongarch/include/asm/unwind.h
> +++ b/arch/loongarch/include/asm/unwind.h
> @@ -16,6 +16,7 @@
> enum unwinder_type {
> UNWINDER_GUESS,
> UNWINDER_PROLOGUE,
> + UNWINDER_ORC,
> };
>
> struct unwind_state {
> @@ -24,7 +25,7 @@ struct unwind_state {
> struct task_struct *task;
> bool first, error, reset;
> int graph_idx;
> - unsigned long sp, pc, ra;
> + unsigned long sp, pc, ra, fp;
> };
>
> bool default_next_frame(struct unwind_state *state);
> @@ -34,6 +35,17 @@ void unwind_start(struct unwind_state *state,
> bool unwind_next_frame(struct unwind_state *state);
> unsigned long unwind_get_return_address(struct unwind_state *state);
>
> +#ifdef CONFIG_UNWINDER_ORC
> +void unwind_init(void);
> +void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
> + void *orc, size_t orc_size);
> +#else
> +static inline void unwind_init(void) {}
> +static inline
> +void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
> + void *orc, size_t orc_size) {}
> +#endif
> +
> static inline bool unwind_done(struct unwind_state *state)
> {
> return state->stack_info.type == STACK_TYPE_UNKNOWN;
> @@ -61,14 +73,17 @@ static __always_inline void __unwind_start(struct unwind_state *state,
> state->sp = regs->regs[3];
> state->pc = regs->csr_era;
> state->ra = regs->regs[1];
> + state->fp = regs->regs[22];
> } else if (task && task != current) {
> state->sp = thread_saved_fp(task);
> state->pc = thread_saved_ra(task);
> state->ra = 0;
> + state->fp = 0;
> } else {
> state->sp = (unsigned long)__builtin_frame_address(0);
> state->pc = (unsigned long)__builtin_return_address(0);
> state->ra = 0;
> + state->fp = 0;
> }
> state->task = task;
> get_stack_info(state->sp, state->task, &state->stack_info);
> @@ -77,6 +92,9 @@ static __always_inline void __unwind_start(struct unwind_state *state,
>
> static __always_inline unsigned long __unwind_get_return_address(struct unwind_state *state)
> {
> - return unwind_done(state) ? 0 : state->pc;
> + if (unwind_done(state))
> + return 0;
> +
> + return __kernel_text_address(state->pc) ? state->pc : 0;
> }
> #endif /* _ASM_UNWIND_H */
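
Editorial note, not part of the patch or the review: the hunk above makes __unwind_get_return_address() return 0 for a pc outside kernel text. A minimal user-space sketch of that behaviour follows. struct text_range and in_text() are hypothetical stand-ins for __kernel_text_address(), and the done flag stands in for unwind_done().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's notion of its text range. */
struct text_range {
	unsigned long start;
	unsigned long end;	/* exclusive */
};

/* Hypothetical replacement for __kernel_text_address(). */
static bool in_text(const struct text_range *t, unsigned long pc)
{
	return pc >= t->start && pc < t->end;
}

/*
 * Mirrors the control flow of the patched helper: 0 when unwinding is
 * done, 0 when pc is not a text address, otherwise pc itself.
 */
static unsigned long get_return_address(bool done, unsigned long pc,
					const struct text_range *t)
{
	if (done)
		return 0;

	return in_text(t, pc) ? pc : 0;
}
```
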
> diff --git a/arch/loongarch/include/asm/unwind_hints.h b/arch/loongarch/include/asm/unwind_hints.h
> new file mode 100644
> index 0000000..82443fe
> --- /dev/null
> +++ b/arch/loongarch/include/asm/unwind_hints.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_LOONGARCH_UNWIND_HINTS_H
> +#define _ASM_LOONGARCH_UNWIND_HINTS_H
> +
> +#include <linux/objtool.h>
> +#include <asm/orc_types.h>
> +
> +#ifdef __ASSEMBLY__
> +
> +.macro UNWIND_HINT_UNDEFINED
> + UNWIND_HINT type=UNWIND_HINT_TYPE_UNDEFINED
> +.endm
Don't we need to set sp_reg=ORC_REG_UNDEFINED for UNWIND_HINT_UNDEFINED?

> +
> +.macro UNWIND_HINT_EMPTY
> + UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL
> +.endm
Don't we need to define UNWIND_HINT_END_OF_STACK?

> +
> +.macro UNWIND_HINT_REGS
> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_REGS
> +.endm
> +
> +.macro UNWIND_HINT_FUNC
> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_CALL
> +.endm
Don't we need to set sp_offset for UNWIND_HINT_REGS and UNWIND_HINT_FUNC?

> +
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ASM_LOONGARCH_UNWIND_HINTS_H */
> diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
> index 4fcc168..a89428c 100644
> --- a/arch/loongarch/kernel/Makefile
> +++ b/arch/loongarch/kernel/Makefile
> @@ -3,6 +3,8 @@
> # Makefile for the Linux/LoongArch kernel.
> #
>
> +OBJECT_FILES_NON_STANDARD_head.o := y
> +
> extra-y := vmlinux.lds
>
> obj-y += head.o cpu-probe.o cacheinfo.o env.o setup.o entry.o genex.o \
> @@ -62,6 +64,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
>
> obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
> obj-$(CONFIG_UNWINDER_PROLOGUE) += unwind_prologue.o
> +obj-$(CONFIG_UNWINDER_ORC) += unwind_orc.o
>
> obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_regs.o
> obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
> diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
> index 65518bb..e43115f 100644
> --- a/arch/loongarch/kernel/entry.S
> +++ b/arch/loongarch/kernel/entry.S
> @@ -14,11 +14,13 @@
> #include <asm/regdef.h>
> #include <asm/stackframe.h>
> #include <asm/thread_info.h>
> +#include <asm/unwind_hints.h>
>
> .text
> .cfi_sections .debug_frame
> .align 5
> -SYM_FUNC_START(handle_syscall)
> +SYM_CODE_START(handle_syscall)
Why change SYM_FUNC_START to SYM_CODE_START here?

> + UNWIND_HINT_UNDEFINED
> csrrd t0, PERCPU_BASE_KS
> la.pcrel t1, kernelsp
> add.d t1, t1, t0
> @@ -56,6 +58,7 @@ SYM_FUNC_START(handle_syscall)
> cfi_st u0, PT_R21
> cfi_st fp, PT_R22
>
> + UNWIND_HINT_REGS
> SAVE_STATIC
>
> #ifdef CONFIG_KGDB
> @@ -71,10 +74,11 @@ SYM_FUNC_START(handle_syscall)
> bl do_syscall
>
> RESTORE_ALL_AND_RET
> -SYM_FUNC_END(handle_syscall)
> +SYM_CODE_END(handle_syscall)
> _ASM_NOKPROBE(handle_syscall)
>
> SYM_CODE_START(ret_from_fork)
> + UNWIND_HINT_REGS
> bl schedule_tail # a0 = struct task_struct *prev
> move a0, sp
> bl syscall_exit_to_user_mode
> @@ -84,6 +88,7 @@ SYM_CODE_START(ret_from_fork)
> SYM_CODE_END(ret_from_fork)
>
> SYM_CODE_START(ret_from_kernel_thread)
> + UNWIND_HINT_REGS
> bl schedule_tail # a0 = struct task_struct *prev
> move a0, s1
> jirl ra, s0, 0
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 78f0663..3f18e3b 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -31,7 +31,8 @@ SYM_FUNC_START(__arch_cpu_idle)
> 1: jr ra
> SYM_FUNC_END(__arch_cpu_idle)
>
> -SYM_FUNC_START(handle_vint)
> +SYM_CODE_START(handle_vint)
> + UNWIND_HINT_UNDEFINED
> BACKUP_T0T1
> SAVE_ALL
> la_abs t1, __arch_cpu_idle
> @@ -46,11 +47,12 @@ SYM_FUNC_START(handle_vint)
> la_abs t0, do_vint
> jirl ra, t0, 0
> RESTORE_ALL_AND_RET
> -SYM_FUNC_END(handle_vint)
> +SYM_CODE_END(handle_vint)
>
> -SYM_FUNC_START(except_vec_cex)
> +SYM_CODE_START(except_vec_cex)
> + UNWIND_HINT_UNDEFINED
> b cache_parity_error
> -SYM_FUNC_END(except_vec_cex)
> +SYM_CODE_END(except_vec_cex)
>
> .macro build_prep_badv
> csrrd t0, LOONGARCH_CSR_BADV
> @@ -66,7 +68,8 @@ SYM_FUNC_END(except_vec_cex)
>
> .macro BUILD_HANDLER exception handler prep
> .align 5
> - SYM_FUNC_START(handle_\exception)
> + SYM_CODE_START(handle_\exception)
> + UNWIND_HINT_UNDEFINED
> 666:
> BACKUP_T0T1
> SAVE_ALL
> @@ -76,7 +79,7 @@ SYM_FUNC_END(except_vec_cex)
> jirl ra, t0, 0
> 668:
> RESTORE_ALL_AND_RET
> - SYM_FUNC_END(handle_\exception)
> + SYM_CODE_END(handle_\exception)
> SYM_DATA(unwind_hint_\exception, .word 668b - 666b)
> .endm
>
> @@ -93,7 +96,8 @@ SYM_FUNC_END(except_vec_cex)
> BUILD_HANDLER watch watch none
> BUILD_HANDLER reserved reserved none /* others */
>
> -SYM_FUNC_START(handle_sys)
> +SYM_CODE_START(handle_sys)
> + UNWIND_HINT_UNDEFINED
> la_abs t0, handle_syscall
> jr t0
> -SYM_FUNC_END(handle_sys)
> +SYM_CODE_END(handle_sys)
> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> index 53b883d..5664390 100644
> --- a/arch/loongarch/kernel/head.S
> +++ b/arch/loongarch/kernel/head.S
> @@ -43,6 +43,7 @@ SYM_DATA(kernel_offset, .long _kernel_offset);
> .align 12
>
> SYM_CODE_START(kernel_entry) # kernel entry point
> + UNWIND_HINT_EMPTY
I'm not sure, but I think this isn't needed, because of
"OBJECT_FILES_NON_STANDARD_head.o := y".

>
> /* Config direct window and set PG */
> li.d t0, CSR_DMW0_INIT # UC, PLV0, 0x8000 xxxx xxxx xxxx
> diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
> index b13b285..83db7e5 100644
> --- a/arch/loongarch/kernel/module.c
> +++ b/arch/loongarch/kernel/module.c
> @@ -20,6 +20,7 @@
> #include <linux/kernel.h>
> #include <asm/alternative.h>
> #include <asm/inst.h>
> +#include <asm/unwind.h>
>
> static int rela_stack_push(s64 stack_value, s64 *rela_stack, size_t *rela_stack_top)
> {
> @@ -515,7 +516,7 @@ static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
> int module_finalize(const Elf_Ehdr *hdr,
> const Elf_Shdr *sechdrs, struct module *mod)
> {
> - const Elf_Shdr *s, *se;
> + const Elf_Shdr *s, *se, *orc = NULL, *orc_ip = NULL;
> const char *secstrs = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
>
> for (s = sechdrs, se = sechdrs + hdr->e_shnum; s < se; s++) {
> @@ -523,7 +524,15 @@ int module_finalize(const Elf_Ehdr *hdr,
> apply_alternatives((void *)s->sh_addr, (void *)s->sh_addr + s->sh_size);
> if (!strcmp(".ftrace_trampoline", secstrs + s->sh_name))
> module_init_ftrace_plt(hdr, s, mod);
> + if (!strcmp(".orc_unwind", secstrs + s->sh_name))
> + orc = s;
> + if (!strcmp(".orc_unwind_ip", secstrs + s->sh_name))
> + orc_ip = s;
> }
>
> + if (orc && orc_ip)
> + unwind_module_init(mod, (void *)orc_ip->sh_addr, orc_ip->sh_size,
> + (void *)orc->sh_addr, orc->sh_size);
> +
> return 0;
> }
> diff --git a/arch/loongarch/kernel/relocate_kernel.S b/arch/loongarch/kernel/relocate_kernel.S
> index f49f6b0..bcc191d 100644
> --- a/arch/loongarch/kernel/relocate_kernel.S
> +++ b/arch/loongarch/kernel/relocate_kernel.S
> @@ -15,6 +15,7 @@
> #include <asm/addrspace.h>
>
> SYM_CODE_START(relocate_new_kernel)
> + UNWIND_HINT_UNDEFINED
> /*
> * a0: EFI boot flag for the new kernel
> * a1: Command line pointer for the new kernel
> @@ -90,6 +91,7 @@ SYM_CODE_END(relocate_new_kernel)
> * then start at the entry point from LOONGARCH_IOCSR_MBUF0.
> */
> SYM_CODE_START(kexec_smp_wait)
> + UNWIND_HINT_UNDEFINED
> 1: li.w t0, 0x100 /* wait for init loop */
> 2: addi.w t0, t0, -1 /* limit mailbox access */
> bnez t0, 2b
> diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
> index 7783f0a..a173b02 100644
> --- a/arch/loongarch/kernel/setup.c
> +++ b/arch/loongarch/kernel/setup.c
> @@ -48,6 +48,7 @@
> #include <asm/sections.h>
> #include <asm/setup.h>
> #include <asm/time.h>
> +#include <asm/unwind.h>
>
> #define SMBIOS_BIOSSIZE_OFFSET 0x09
> #define SMBIOS_BIOSEXTERN_OFFSET 0x13
> @@ -605,6 +606,7 @@ static void __init prefill_possible_map(void)
>
> void __init setup_arch(char **cmdline_p)
> {
> + unwind_init();
I think this line should be after cpu_probe().

> cpu_probe();
>
> init_environ();
> diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
> index 92270f1..9848d42 100644
> --- a/arch/loongarch/kernel/stacktrace.c
> +++ b/arch/loongarch/kernel/stacktrace.c
> @@ -29,6 +29,7 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> regs->csr_era = thread_saved_ra(task);
> }
> regs->regs[1] = 0;
> + regs->regs[22] = 0;
> }
>
> for (unwind_start(&state, task, regs);
> diff --git a/arch/loongarch/kernel/unwind_orc.c b/arch/loongarch/kernel/unwind_orc.c
> new file mode 100644
> index 0000000..08f80ca0
> --- /dev/null
> +++ b/arch/loongarch/kernel/unwind_orc.c
> @@ -0,0 +1,571 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/objtool.h>
> +#include <linux/module.h>
> +#include <linux/sort.h>
> +#include <asm/exception.h>
> +#include <asm/orc_types.h>
> +#include <asm/orc_lookup.h>
> +#include <asm/orc_header.h>
> +#include <asm/ptrace.h>
> +#include <asm/setup.h>
> +#include <asm/stacktrace.h>
> +#include <asm/tlb.h>
> +#include <asm/unwind.h>
> +
> +ORC_HEADER;
> +
> +#define orc_warn(fmt, ...) \
> + printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
> +
> +extern int __start_orc_unwind_ip[];
> +extern int __stop_orc_unwind_ip[];
> +extern struct orc_entry __start_orc_unwind[];
> +extern struct orc_entry __stop_orc_unwind[];
> +
> +static bool orc_init __ro_after_init;
> +static unsigned int lookup_num_blocks __ro_after_init;
> +
> +/* Fake frame pointer entry -- used as a fallback for generated code */
> +static struct orc_entry orc_fp_entry = {
> + .type = UNWIND_HINT_TYPE_CALL,
> + .sp_reg = ORC_REG_BP,
> + .sp_offset = 16,
> + .bp_reg = ORC_REG_PREV_SP,
> + .bp_offset = -16,
> + .ra_reg = ORC_REG_PREV_SP,
> + .ra_offset = -8,
> +};
> +
> +static inline unsigned long orc_ip(const int *ip)
> +{
> + return (unsigned long)ip + *ip;
> +}
> +
> +static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table,
> + unsigned int num_entries, unsigned long ip)
> +{
> + int *first = ip_table;
> + int *last = ip_table + num_entries - 1;
> + int *mid = first, *found = first;
> +
> + if (!num_entries)
> + return NULL;
> +
> + /*
> + * Do a binary range search to find the rightmost duplicate of a given
> + * starting address. Some entries are section terminators which are
> + * "weak" entries for ensuring there are no gaps. They should be
> + * ignored when they conflict with a real entry.
> + */
> + while (first <= last) {
> + mid = first + ((last - first) / 2);
> +
> + if (orc_ip(mid) <= ip) {
> + found = mid;
> + first = mid + 1;
> + } else
> + last = mid - 1;
> + }
> +
> + return u_table + (found - ip_table);
> +}
> +
> +#ifdef CONFIG_MODULES
> +static struct orc_entry *orc_module_find(unsigned long ip)
> +{
> + struct module *mod;
> +
> + mod = __module_address(ip);
> + if (!mod || !mod->arch.orc_unwind || !mod->arch.orc_unwind_ip)
> + return NULL;
> + return __orc_find(mod->arch.orc_unwind_ip, mod->arch.orc_unwind,
> + mod->arch.num_orcs, ip);
> +}
> +#else
> +static struct orc_entry *orc_module_find(unsigned long ip)
> +{
> + return NULL;
> +}
> +#endif
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +static struct orc_entry *orc_find(unsigned long ip);
> +
> +/*
> + * Ftrace dynamic trampolines do not have orc entries of their own.
> + * But they are copies of the ftrace entries that are static and
> + * defined in ftrace_*.S, which do have orc entries.
> + *
> + * If the unwinder comes across a ftrace trampoline, then find the
> + * ftrace function that was used to create it, and use that ftrace
> + * function's orc entry, as the placement of the return code in
> + * the stack will be identical.
> + */
> +static struct orc_entry *orc_ftrace_find(unsigned long ip)
> +{
> + struct ftrace_ops *ops;
> + unsigned long tramp_addr, offset;
> +
> + ops = ftrace_ops_trampoline(ip);
> + if (!ops)
> + return NULL;
> +
> + /* Set tramp_addr to the start of the code copied by the trampoline */
> + if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
> + tramp_addr = (unsigned long)ftrace_regs_caller;
> + else
> + tramp_addr = (unsigned long)ftrace_caller;
> +
> + /* Now place tramp_addr to the location within the trampoline ip is at */
> + offset = ip - ops->trampoline;
> + tramp_addr += offset;
> +
> + /* Prevent unlikely recursion */
> + if (ip == tramp_addr)
> + return NULL;
> +
> + return orc_find(tramp_addr);
> +}
> +#else
> +static struct orc_entry *orc_ftrace_find(unsigned long ip)
> +{
> + return NULL;
> +}
> +#endif
> +
> +/*
> + * If we crash with IP==0, the last successfully executed instruction
> + * was probably an indirect function call with a NULL function pointer,
> + * and we don't have unwind information for NULL.
> + * This hardcoded ORC entry for IP==0 allows us to unwind from a NULL function
> + * pointer into its parent and then continue normally from there.
> + */
> +static struct orc_entry null_orc_entry = {
> + .sp_offset = sizeof(long),
> + .sp_reg = ORC_REG_SP,
> + .bp_reg = ORC_REG_UNDEFINED,
> + .type = ORC_TYPE_CALL
> +};
> +
> +static struct orc_entry *orc_find(unsigned long ip)
> +{
> + static struct orc_entry *orc;
> +
> + if (ip == 0)
> + return &null_orc_entry;
> +
> + /* For non-init vmlinux addresses, use the fast lookup table: */
> + if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
> + unsigned int idx, start, stop;
> +
> + idx = (ip - LOOKUP_START_IP) / LOOKUP_BLOCK_SIZE;
> +
> + if (unlikely((idx >= lookup_num_blocks-1))) {
> + orc_warn("WARNING: bad lookup idx: idx=%u num=%u ip=%pB\n",
> + idx, lookup_num_blocks, (void *)ip);
> + return NULL;
> + }
> +
> + start = orc_lookup[idx];
> + stop = orc_lookup[idx + 1] + 1;
> +
> + if (unlikely((__start_orc_unwind + start >= __stop_orc_unwind) ||
> + (__start_orc_unwind + stop > __stop_orc_unwind))) {
> + orc_warn("WARNING: bad lookup value: idx=%u num=%u start=%u stop=%u ip=%pB\n",
> + idx, lookup_num_blocks, start, stop, (void *)ip);
> + return NULL;
> + }
> +
> + return __orc_find(__start_orc_unwind_ip + start,
> + __start_orc_unwind + start, stop - start, ip);
> + }
> +
> + /* vmlinux .init slow lookup: */
> + if (is_kernel_inittext(ip))
> + return __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
> + __stop_orc_unwind_ip - __start_orc_unwind_ip, ip);
> +
> + /* Module lookup: */
> + orc = orc_module_find(ip);
> + if (orc)
> + return orc;
> +
> + return orc_ftrace_find(ip);
> +}
> +
> +#ifdef CONFIG_MODULES
> +
> +static DEFINE_MUTEX(sort_mutex);
> +static int *cur_orc_ip_table = __start_orc_unwind_ip;
> +static struct orc_entry *cur_orc_table = __start_orc_unwind;
> +
> +static void orc_sort_swap(void *_a, void *_b, int size)
> +{
> + struct orc_entry *orc_a, *orc_b;
> + int *a = _a, *b = _b, tmp;
> + int delta = _b - _a;
> +
> + /* Swap the .orc_unwind_ip entries: */
> + tmp = *a;
> + *a = *b + delta;
> + *b = tmp - delta;
> +
> + /* Swap the corresponding .orc_unwind entries: */
> + orc_a = cur_orc_table + (a - cur_orc_ip_table);
> + orc_b = cur_orc_table + (b - cur_orc_ip_table);
> + swap(*orc_a, *orc_b);
> +}
> +
> +static int orc_sort_cmp(const void *_a, const void *_b)
> +{
> + struct orc_entry *orc_a;
> + const int *a = _a, *b = _b;
> + unsigned long a_val = orc_ip(a);
> + unsigned long b_val = orc_ip(b);
> +
> + if (a_val > b_val)
> + return 1;
> + if (a_val < b_val)
> + return -1;
> +
> + /*
> + * The "weak" section terminator entries need to always be first
> + * to ensure the lookup code skips them in favor of real entries.
> + * These terminator entries exist to handle any gaps created by
> + * whitelisted .o files which didn't get objtool generation.
> + */
> + orc_a = cur_orc_table + (a - cur_orc_ip_table);
> + return orc_a->type == ORC_TYPE_UNDEFINED ? -1 : 1;
> +}
> +
> +void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size,
> + void *_orc, size_t orc_size)
> +{
> + int *orc_ip = _orc_ip;
> + struct orc_entry *orc = _orc;
> + unsigned int num_entries = orc_ip_size / sizeof(int);
> +
> + WARN_ON_ONCE(orc_ip_size % sizeof(int) != 0 ||
> + orc_size % sizeof(*orc) != 0 ||
> + num_entries != orc_size / sizeof(*orc));
> +
> + /*
> + * The 'cur_orc_*' globals allow the orc_sort_swap() callback to
> + * associate an .orc_unwind_ip table entry with its corresponding
> + * .orc_unwind entry so they can both be swapped.
> + */
> + mutex_lock(&sort_mutex);
> + cur_orc_ip_table = orc_ip;
> + cur_orc_table = orc;
> + sort(orc_ip, num_entries, sizeof(int), orc_sort_cmp, orc_sort_swap);
> + mutex_unlock(&sort_mutex);
> +
> + mod->arch.orc_unwind_ip = orc_ip;
> + mod->arch.orc_unwind = orc;
> + mod->arch.num_orcs = num_entries;
> +}
> +#endif
> +
> +void __init unwind_init(void)
> +{
> + size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip;
> + size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind;
> + size_t num_entries = orc_ip_size / sizeof(int);
> + struct orc_entry *orc;
> + int i;
> +
> + if (!num_entries || orc_ip_size % sizeof(int) != 0 ||
> + orc_size % sizeof(struct orc_entry) != 0 ||
> + num_entries != orc_size / sizeof(struct orc_entry)) {
> + orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n");
> + return;
> + }
> +
> + /*
> + * Note, the orc_unwind and orc_unwind_ip tables were already
> + * sorted at build time via the 'sorttable' tool.
> + * It's ready for binary search straight away, no need to sort it.
> + */
> +
> + /* Initialize the fast lookup table: */
> + lookup_num_blocks = orc_lookup_end - orc_lookup;
> + for (i = 0; i < lookup_num_blocks-1; i++) {
> + orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
> + num_entries,
> + LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
> + if (!orc) {
> + orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
> + return;
> + }
> +
> + orc_lookup[i] = orc - __start_orc_unwind;
> + }
> +
> + /* Initialize the ending block: */
> + orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries,
> + LOOKUP_STOP_IP);
> + if (!orc) {
> + orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
> + return;
> + }
> + orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind;
> +
> + orc_init = true;
> +}
> +
> +static inline bool on_stack(struct stack_info *info, unsigned long addr, size_t len)
> +{
> + unsigned long begin = info->begin;
> + unsigned long end = info->end;
> +
> + return (info->type != STACK_TYPE_UNKNOWN &&
> + addr >= begin && addr < end &&
> + addr + len > begin && addr + len <= end);
> +}
> +
> +static bool stack_access_ok(struct unwind_state *state, unsigned long addr,
> + size_t len)
> +{
> + struct stack_info *info = &state->stack_info;
> +
> + if (on_stack(info, addr, len))
> + return true;
> +
> + return !get_stack_info(addr, state->task, info) &&
> + on_stack(info, addr, len);
> +}
> +
> +unsigned long unwind_get_return_address(struct unwind_state *state)
> +{
> + return __unwind_get_return_address(state);
> +}
> +EXPORT_SYMBOL_GPL(unwind_get_return_address);
> +
> +void unwind_start(struct unwind_state *state, struct task_struct *task,
> + struct pt_regs *regs)
> +{
> + __unwind_start(state, task, regs);
> + if (!unwind_done(state) && !__kernel_text_address(state->pc))
> + unwind_next_frame(state);
> +}
> +EXPORT_SYMBOL_GPL(unwind_start);
> +
> +static bool is_entry_func(unsigned long addr)
> +{
> + extern u32 kernel_entry;
> + extern u32 kernel_entry_end;
> +
> + return addr >= (unsigned long)&kernel_entry &&
> + addr < (unsigned long)&kernel_entry_end;
> +}
> +
> +static inline unsigned long bt_address(unsigned long ra)
> +{
> + extern unsigned long eentry;
> +
> + if (__kernel_text_address(ra))
> + return ra;
> +
> + /* We are in preempt_disable() here */
> + if (__module_text_address(ra))
> + return ra;
> +
> + if (ra >= eentry && ra < eentry + EXCCODE_INT_END * VECSIZE) {
> + unsigned long type = (ra - eentry) / VECSIZE;
> + unsigned long offset = (ra - eentry) % VECSIZE;
> + unsigned long func;
> +
> + switch (type) {
> + case EXCCODE_TLBL:
> + case EXCCODE_TLBI:
> + func = (unsigned long)handle_tlb_load;
> + break;
> + case EXCCODE_TLBS:
> + func = (unsigned long)handle_tlb_store;
> + break;
> + case EXCCODE_TLBM:
> + func = (unsigned long)handle_tlb_modify;
> + break;
> + case EXCCODE_TLBNR:
> + case EXCCODE_TLBNX:
> + case EXCCODE_TLBPE:
> + func = (unsigned long)handle_tlb_protect;
> + break;
> + case EXCCODE_ADE:
> + func = (unsigned long)handle_ade;
> + break;
> + case EXCCODE_ALE:
> + func = (unsigned long)handle_ale;
> + break;
> + case EXCCODE_BCE:
> + func = (unsigned long)handle_bce;
> + break;
> + case EXCCODE_SYS:
> + func = (unsigned long)handle_sys;
> + break;
> + case EXCCODE_BP:
> + func = (unsigned long)handle_bp;
> + break;
> + case EXCCODE_INE:
> + case EXCCODE_IPE:
> + func = (unsigned long)handle_ri;
> + break;
> + case EXCCODE_FPDIS:
> + func = (unsigned long)handle_fpu;
> + break;
> + case EXCCODE_LSXDIS:
> + func = (unsigned long)handle_lsx;
> + break;
> + case EXCCODE_LASXDIS:
> + func = (unsigned long)handle_lasx;
> + break;
> + case EXCCODE_FPE:
> + func = (unsigned long)handle_fpe;
> + break;
> + case EXCCODE_WATCH:
> + func = (unsigned long)handle_watch;
> + break;
> + case EXCCODE_BTDIS:
> + func = (unsigned long)handle_lbt;
> + break;
> + case EXCCODE_INT_START ... EXCCODE_INT_END - 1:
> + func = (unsigned long)handle_vint;
> + break;
> + default:
> + func = (unsigned long)handle_reserved;
> + break;
> + }
> +
> + return func + offset;
> + }
> +
> + return ra;
> +}
> +
> +bool unwind_next_frame(struct unwind_state *state)
> +{
> + struct stack_info *info = &state->stack_info;
> + struct orc_entry *orc;
> + struct pt_regs *regs;
> + unsigned long *p, pc;
> +
> + if (unwind_done(state))
> + return false;
> +
> + /* Don't let modules unload while we're reading their ORC data. */
> + preempt_disable();
> +
> + if (is_entry_func(state->pc))
> + goto end;
> +
> + orc = orc_find(state->pc);
> + if (!orc) {
> + orc = &orc_fp_entry;
> + state->error = true;
> + }
> +
> + switch (orc->sp_reg) {
> + case ORC_REG_SP:
> + state->sp = state->sp + orc->sp_offset;
> + break;
> + case ORC_REG_BP:
> + state->sp = state->fp;
> + break;
> + default:
> + orc_warn("unknown SP base reg %d at %pB\n",
> + orc->sp_reg, (void *)state->pc);
> + goto err;
> + }
> +
> + switch (orc->bp_reg) {
> + case ORC_REG_PREV_SP:
> + p = (unsigned long *)(state->sp + orc->bp_offset);
> + if (!stack_access_ok(state, (unsigned long)p, sizeof(unsigned long)))
> + goto err;
> +
> + state->fp = *p;
> + break;
> + case ORC_REG_UNDEFINED:
> + /* Nothing. */
> + break;
> + default:
> + orc_warn("unknown FP base reg %d at %pB\n",
> + orc->bp_reg, (void *)state->pc);
> + goto err;
> + }
> +
> + switch (orc->type) {
> + case UNWIND_HINT_TYPE_CALL:
> + if (orc->ra_reg == ORC_REG_PREV_SP) {
> + p = (unsigned long *)(state->sp + orc->ra_offset);
> + if (!stack_access_ok(state, (unsigned long)p, sizeof(unsigned long)))
> + goto err;
> +
> + pc = unwind_graph_addr(state, *p, state->sp);
> + pc -= LOONGARCH_INSN_SIZE;
> + } else if (orc->ra_reg == ORC_REG_UNDEFINED) {
> + if (!state->ra || state->ra == state->pc)
> + goto err;
> +
> + pc = unwind_graph_addr(state, state->ra, state->sp);
> + pc -= LOONGARCH_INSN_SIZE;
> + state->ra = 0;
> + } else {
> + orc_warn("unknown ra base reg %d at %pB\n",
> + orc->ra_reg, (void *)state->pc);
> + goto err;
> + }
> + break;
> + case UNWIND_HINT_TYPE_REGS:
> + if (state->stack_info.type == STACK_TYPE_IRQ && state->sp == info->end)
> + regs = (struct pt_regs *)info->next_sp;
> + else
> + regs = (struct pt_regs *)state->sp;
> +
> + if (!stack_access_ok(state, (unsigned long)regs, sizeof(*regs)))
> + goto err;
> +
> + if ((info->end == (unsigned long)regs + sizeof(*regs)) &&
> + !regs->regs[3] && !regs->regs[1])
> + goto end;
> +
> + if (user_mode(regs))
> + goto end;
> +
> + pc = regs->csr_era;
> + if (!__kernel_text_address(pc))
> + goto err;
> +
> + state->sp = regs->regs[3];
> + state->ra = regs->regs[1];
> + state->fp = regs->regs[22];
> + get_stack_info(state->sp, state->task, info);
> +
> + break;
> + default:
> + orc_warn("unknown .orc_unwind entry type %d at %pB\n",
> + orc->type, (void *)state->pc);
> + goto err;
> + }
> +
> + state->pc = bt_address(pc);
> + if (!state->pc) {
> + pr_err("cannot find unwind pc at %pK\n", (void *)pc);
> + goto err;
> + }
> +
> + if (!__kernel_text_address(state->pc))
> + goto err;
> +
> + preempt_enable();
> + return true;
> +
> +err:
> + state->error = true;
> +
> +end:
> + preempt_enable();
> + state->stack_info.type = STACK_TYPE_UNKNOWN;
> + return false;
> +}
> +EXPORT_SYMBOL_GPL(unwind_next_frame);
> diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> index bb2ec86..09fd4eb 100644
> --- a/arch/loongarch/kernel/vmlinux.lds.S
> +++ b/arch/loongarch/kernel/vmlinux.lds.S
> @@ -2,6 +2,7 @@
> #include <linux/sizes.h>
> #include <asm/asm-offsets.h>
> #include <asm/thread_info.h>
> +#include <asm/orc_lookup.h>
>
> #define PAGE_SIZE _PAGE_SIZE
> #define RO_EXCEPTION_TABLE_ALIGN 4
> @@ -99,6 +100,8 @@ SECTIONS
> _sdata = .;
> RO_DATA(4096)
>
> + ORC_UNWIND_TABLE
> +
> .got : ALIGN(16) { *(.got) }
> .plt : ALIGN(16) { *(.plt) }
> .got.plt : ALIGN(16) { *(.got.plt) }
> diff --git a/arch/loongarch/lib/Makefile b/arch/loongarch/lib/Makefile
> index a77bf160..e3023d9 100644
> --- a/arch/loongarch/lib/Makefile
> +++ b/arch/loongarch/lib/Makefile
> @@ -3,6 +3,8 @@
> # Makefile for LoongArch-specific library files.
> #
>
> +OBJECT_FILES_NON_STANDARD := y
> +
> lib-y += delay.o memset.o memcpy.o memmove.o \
> clear_user.o copy_user.o csum.o dump_tlb.o unaligned.o
>
> diff --git a/arch/loongarch/mm/tlbex.S b/arch/loongarch/mm/tlbex.S
> index ca17dd3..a44387b 100644
> --- a/arch/loongarch/mm/tlbex.S
> +++ b/arch/loongarch/mm/tlbex.S
> @@ -17,7 +17,8 @@
> #define PTRS_PER_PTE_BITS (PAGE_SHIFT - 3)
>
> .macro tlb_do_page_fault, write
> - SYM_FUNC_START(tlb_do_page_fault_\write)
> + SYM_CODE_START(tlb_do_page_fault_\write)
> + UNWIND_HINT_UNDEFINED
> SAVE_ALL
> csrrd a2, LOONGARCH_CSR_BADV
> move a0, sp
> @@ -25,13 +26,14 @@
> li.w a1, \write
> bl do_page_fault
> RESTORE_ALL_AND_RET
> - SYM_FUNC_END(tlb_do_page_fault_\write)
> + SYM_CODE_END(tlb_do_page_fault_\write)
> .endm
>
> tlb_do_page_fault 0
> tlb_do_page_fault 1
>
> -SYM_FUNC_START(handle_tlb_protect)
> +SYM_CODE_START(handle_tlb_protect)
> + UNWIND_HINT_UNDEFINED
> BACKUP_T0T1
> SAVE_ALL
> move a0, sp
> @@ -41,9 +43,10 @@ SYM_FUNC_START(handle_tlb_protect)
> la_abs t0, do_page_fault
> jirl ra, t0, 0
> RESTORE_ALL_AND_RET
> -SYM_FUNC_END(handle_tlb_protect)
> +SYM_CODE_END(handle_tlb_protect)
>
> -SYM_FUNC_START(handle_tlb_load)
> +SYM_CODE_START(handle_tlb_load)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, EXCEPTION_KS0
> csrwr t1, EXCEPTION_KS1
> csrwr ra, EXCEPTION_KS2
> @@ -187,16 +190,18 @@ nopage_tlb_load:
> csrrd ra, EXCEPTION_KS2
> la_abs t0, tlb_do_page_fault_0
> jr t0
> -SYM_FUNC_END(handle_tlb_load)
> +SYM_CODE_END(handle_tlb_load)
>
> -SYM_FUNC_START(handle_tlb_load_ptw)
> +SYM_CODE_START(handle_tlb_load_ptw)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_KS0
> csrwr t1, LOONGARCH_CSR_KS1
> la_abs t0, tlb_do_page_fault_0
> jr t0
> -SYM_FUNC_END(handle_tlb_load_ptw)
> +SYM_CODE_END(handle_tlb_load_ptw)
>
> -SYM_FUNC_START(handle_tlb_store)
> +SYM_CODE_START(handle_tlb_store)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, EXCEPTION_KS0
> csrwr t1, EXCEPTION_KS1
> csrwr ra, EXCEPTION_KS2
> @@ -343,16 +348,18 @@ nopage_tlb_store:
> csrrd ra, EXCEPTION_KS2
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_store)
> +SYM_CODE_END(handle_tlb_store)
>
> -SYM_FUNC_START(handle_tlb_store_ptw)
> +SYM_CODE_START(handle_tlb_store_ptw)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_KS0
> csrwr t1, LOONGARCH_CSR_KS1
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_store_ptw)
> +SYM_CODE_END(handle_tlb_store_ptw)
>
> -SYM_FUNC_START(handle_tlb_modify)
> +SYM_CODE_START(handle_tlb_modify)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, EXCEPTION_KS0
> csrwr t1, EXCEPTION_KS1
> csrwr ra, EXCEPTION_KS2
> @@ -497,16 +504,18 @@ nopage_tlb_modify:
> csrrd ra, EXCEPTION_KS2
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_modify)
> +SYM_CODE_END(handle_tlb_modify)
>
> -SYM_FUNC_START(handle_tlb_modify_ptw)
> +SYM_CODE_START(handle_tlb_modify_ptw)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_KS0
> csrwr t1, LOONGARCH_CSR_KS1
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_modify_ptw)
> +SYM_CODE_END(handle_tlb_modify_ptw)
>
> -SYM_FUNC_START(handle_tlb_refill)
> +SYM_CODE_START(handle_tlb_refill)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_TLBRSAVE
> csrrd t0, LOONGARCH_CSR_PGD
> lddir t0, t0, 3
> @@ -521,4 +530,4 @@ SYM_FUNC_START(handle_tlb_refill)
> tlbfill
> csrrd t0, LOONGARCH_CSR_TLBRSAVE
> ertn
> -SYM_FUNC_END(handle_tlb_refill)
> +SYM_CODE_END(handle_tlb_refill)
> diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
> index 58151d0..bbd1d47 100644
> --- a/arch/loongarch/power/Makefile
> +++ b/arch/loongarch/power/Makefile
> @@ -1,3 +1,5 @@
> +OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
hibernate_asm.o has no problem?

Huacai
> +
> obj-y += platform.o
>
> obj-$(CONFIG_SUSPEND) += suspend.o suspend_asm.o
> diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile
> index 5c97d1463..997f41c 100644
> --- a/arch/loongarch/vdso/Makefile
> +++ b/arch/loongarch/vdso/Makefile
> @@ -3,6 +3,7 @@
>
> KASAN_SANITIZE := n
> KCOV_INSTRUMENT := n
> +OBJECT_FILES_NON_STANDARD := y
>
> # Include the generic Makefile to check the built vdso.
> include $(srctree)/lib/vdso/Makefile
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index d7779a1..df29ddb 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -116,6 +116,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> */
> #define __stringify_label(n) #n
>
> +#define __annotate_reachable(c) ({ \
> + asm volatile(__stringify_label(c) ":\n\t" \
> + ".pushsection .discard.reachable\n\t" \
> + ".long " __stringify_label(c) "b - .\n\t" \
> + ".popsection\n\t"); \
> +})
> +#define annotate_reachable() __annotate_reachable(__COUNTER__)
> +
> #define __annotate_unreachable(c) ({ \
> asm volatile(__stringify_label(c) ":\n\t" \
> ".pushsection .discard.unreachable\n\t" \
> @@ -128,6 +136,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> #define __annotate_jump_table __section(".rodata..c_jump_table")
>
> #else /* !CONFIG_OBJTOOL */
> +#define annotate_reachable()
> #define annotate_unreachable()
> #define __annotate_jump_table
> #endif /* CONFIG_OBJTOOL */
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 576cf64..baaed78 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -33,7 +33,10 @@ ifdef CONFIG_UNWINDER_ORC
> ifeq ($(ARCH),x86_64)
> ARCH := x86
> endif
> -HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/x86/include
> +ifeq ($(ARCH),loongarch)
> +ARCH := loongarch
> +endif
> +HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/$(ARCH)/include
> HOSTCFLAGS_sorttable.o += -DUNWINDER_ORC_ENABLED
> endif
>
> --
> 2.1.0
>
>
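
For readers following the review, the search performed by __orc_find() in the quoted patch can be sketched in userspace as below. This is only an illustration under stated assumptions: the table holds absolute start addresses rather than the self-relative offsets decoded by orc_ip(), and the name find_rightmost_le() is hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the rightmost entry whose start address is <= ip.
 * If every entry starts above ip, index 0 is returned, mirroring the
 * quoted code, which initialises 'found' to the first entry.
 * Returns -1 for an empty table (the kernel code returns NULL there).
 */
static long find_rightmost_le(const unsigned long *starts, size_t n,
			      unsigned long ip)
{
	long first = 0, last = (long)n - 1, found = 0;

	if (n == 0)
		return -1;

	while (first <= last) {
		long mid = first + (last - first) / 2;

		if (starts[mid] <= ip) {
			/* Candidate; keep looking to the right for a later one. */
			found = mid;
			first = mid + 1;
		} else {
			last = mid - 1;
		}
	}

	return found;
}
```

Because the search keeps moving right after a match, duplicate start addresses resolve to the last of them. That is what lets the "weak" terminator entries, which orc_sort_cmp() sorts first, lose to a real entry at the same address.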

2023-10-14 02:23:15

by Tiezhu Yang

Subject: Re: [PATCH v2 1/8] objtool/LoongArch: Enable objtool to be built



On 10/10/2023 08:45 PM, Huacai Chen wrote:
> Hi, Tiezhu,
>
> On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>>
>> Add the minimal changes to enable objtool build on LoongArch,
>> most of the functions are stubs to only fix the build errors
>> when make -C tools/objtool.
>>
>> This is similar with commit e52ec98c5ab1 ("objtool/powerpc:
>> Enable objtool to be built on ppc").

...

>> diff --git a/tools/objtool/arch/loongarch/include/arch/special.h b/tools/objtool/arch/loongarch/include/arch/special.h
>> new file mode 100644
>> index 0000000..1a8245c
>> --- /dev/null
>> +++ b/tools/objtool/arch/loongarch/include/arch/special.h
>> @@ -0,0 +1,33 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +#ifndef _OBJTOOL_ARCH_SPECIAL_H
>> +#define _OBJTOOL_ARCH_SPECIAL_H
>> +
>> +/*
>> + * See more info about struct exception_table_entry
>> + * in arch/loongarch/include/asm/extable.h
>> + */
>> +#define EX_ENTRY_SIZE 12
>> +#define EX_ORIG_OFFSET 0
>> +#define EX_NEW_OFFSET 4
> Other archs use tab for indentation in special.h
>

OK, thank you, will do it.

Thanks,
Tiezhu

2023-10-14 03:40:40

by Tiezhu Yang

Subject: Re: [PATCH v2 4/8] objtool/LoongArch: Enable orc to be built



On 10/10/2023 08:52 PM, Huacai Chen wrote:
> Hi, Tiezhu,
>
> On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>>
>> Implement arch-specific init_orc_entry(), reg_name(), orc_type_name(),
>> print_reg() and orc_print_dump(), then set BUILD_ORC as y to build the
>> orc related files.

...

>> +#define ORC_REG_SP 2
>> +#define ORC_REG_BP 3
> There is no BP register for LoongArch, so I think all 'BP' should be
> 'FP' in this patch.

Makes sense, thank you, will do it.

>> +#define ORC_REG_MAX 4

...

>> +struct orc_entry {
>> + s16 sp_offset;
>> + s16 bp_offset;
>> + s16 ra_offset;
>> + unsigned int sp_reg:4;
>> + unsigned int bp_reg:4;
>> + unsigned int ra_reg:4;
>> + unsigned int type:3;
>> + unsigned int signal:1;
>> +};

At the same time, I will replace bp_offset with fp_offset and
replace bp_reg with fp_reg, then modify the related code.

Thanks,
Tiezhu
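
To make the proposed rename concrete, here is a minimal userspace sketch of the entry with fp_offset/fp_reg and of one CALL-type unwind step, modeled on the quoted unwind_next_frame(). Several things are assumptions made for illustration: the values of ORC_REG_UNDEFINED and ORC_REG_PREV_SP (only SP=2, BP=3 and MAX=4 are quoted in this thread), the use of plain stdint types, and the names struct frame and step_call(). Unlike the kernel code, it performs no stack bounds checking.

```c
#include <assert.h>
#include <stdint.h>

/* SP and FP values follow the quoted defines; the other two are assumed. */
#define ORC_REG_UNDEFINED	0
#define ORC_REG_PREV_SP		1
#define ORC_REG_SP		2
#define ORC_REG_FP		3

struct orc_entry {
	int16_t sp_offset;
	int16_t fp_offset;	/* was bp_offset */
	int16_t ra_offset;
	unsigned int sp_reg:4;
	unsigned int fp_reg:4;	/* was bp_reg */
	unsigned int ra_reg:4;
	unsigned int type:3;
	unsigned int signal:1;
};

/* Hypothetical register snapshot for one frame. */
struct frame {
	uintptr_t sp, fp, ra;
};

/* Apply one CALL-type entry; returns 0 on success, -1 on an unknown base. */
static int step_call(const struct orc_entry *orc, struct frame *f)
{
	switch (orc->sp_reg) {
	case ORC_REG_SP:
		f->sp += orc->sp_offset;	/* previous SP */
		break;
	case ORC_REG_FP:
		f->sp = f->fp;
		break;
	default:
		return -1;
	}

	/* Saved FP and RA live at fixed offsets from the previous SP. */
	if (orc->fp_reg == ORC_REG_PREV_SP)
		f->fp = *(const uintptr_t *)(f->sp + orc->fp_offset);
	else if (orc->fp_reg != ORC_REG_UNDEFINED)
		return -1;

	if (orc->ra_reg == ORC_REG_PREV_SP)
		f->ra = *(const uintptr_t *)(f->sp + orc->ra_offset);
	else if (orc->ra_reg != ORC_REG_UNDEFINED)
		return -1;	/* UNDEFINED: RA is still in the register */

	return 0;
}
```

With sp_reg = ORC_REG_SP and both fp_reg and ra_reg set to ORC_REG_PREV_SP, the step adds sp_offset to the stack pointer and then reloads fp and ra from just below the new value.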

2023-10-14 09:21:41

by Tiezhu Yang

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support



On 10/11/2023 12:37 PM, Huacai Chen wrote:
> Hi, Tiezhu,
>
> Maybe "LoongArch: Add ORC stack unwinder support" is better.

OK, will modify it.

>
> On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>>
>> The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
>> similar in concept to a DWARF unwinder. The difference is that the format
>> of the ORC data is much simpler than DWARF, which in turn allows the ORC
>> unwinder to be much simpler and faster.

...

>> +ifdef CONFIG_OBJTOOL
>> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
>> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
>> +ifeq ($(shell as --help 2>&1 | grep -e '-mthin-add-sub'),)
>> + $(error Sorry, you need a newer gas version with -mthin-add-sub option)
> I prefer no error out here, because without this option we can still
> built a runnable kernel.

I agree that it is better not to error out and stop the build. However,
old binutils produces many objtool warnings during compilation, so a
message is still needed to tell users what happened and how to get rid
of those warnings.

That is to say, I would prefer to replace "error" with "warning", i.e.
use $(warning ...) instead of $(error ...).

>> +endif
>> +KBUILD_AFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
>> +KBUILD_CFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
>> +KBUILD_CFLAGS += -fno-optimize-sibling-calls -fno-jump-tables -falign-functions=4
>> +endif

...

>> +#define ORC_REG_BP 3
> Use FP instead of BP in this patch, too.

OK, will do it.

>
>> +#define ORC_REG_MAX 4

...

>> +.macro UNWIND_HINT_UNDEFINED
>
>> + UNWIND_HINT type=UNWIND_HINT_TYPE_UNDEFINED
>> +.endm
> We don't need to set sp_reg=ORC_REG_UNDEFINED for UNWIND_HINT_UNDEFINED?

Right, there is no need to set sp_reg. The instructions marked with
UNDEFINED are blind spots in ORC coverage and are not related to the
stack trace; this is similar to x86.

>
>> +
>> +.macro UNWIND_HINT_EMPTY
>> + UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL
>> +.endm
> We don't need to define UNWIND_HINT_END_OF_STACK?

Right, it is not needed at present.

>
>> +
>> +.macro UNWIND_HINT_REGS
>> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_REGS
>> +.endm
>> +
>> +.macro UNWIND_HINT_FUNC
>> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_CALL
>> +.endm
> We don't need to set sp_offset for UNWIND_HINT_REGS and UNWIND_HINT_FUNC?

sp_offset defaults to 0, so there is no need to set it unless a different
value is required; see the macro definition in include/linux/objtool.h:
.macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 signal=0

>
>> +
>> +#endif /* __ASSEMBLY__ */

...

>> diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
>> index 65518bb..e43115f 100644
>> --- a/arch/loongarch/kernel/entry.S
>> +++ b/arch/loongarch/kernel/entry.S
>> @@ -14,11 +14,13 @@
>> #include <asm/regdef.h>
>> #include <asm/stackframe.h>
>> #include <asm/thread_info.h>
>> +#include <asm/unwind_hints.h>
>>
>> .text
>> .cfi_sections .debug_frame
>> .align 5
>> -SYM_FUNC_START(handle_syscall)
>> +SYM_CODE_START(handle_syscall)
> Why?
>

see include/linux/linkage.h
FUNC -- C-like functions (proper stack frame etc.)
CODE -- non-C code (e.g. irq handlers with different, special stack etc.)

>> + UNWIND_HINT_UNDEFINED
>> csrrd t0, PERCPU_BASE_KS

...

>> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
>> index 53b883d..5664390 100644
>> --- a/arch/loongarch/kernel/head.S
>> +++ b/arch/loongarch/kernel/head.S
>> @@ -43,6 +43,7 @@ SYM_DATA(kernel_offset, .long _kernel_offset);
>> .align 12
>>
>> SYM_CODE_START(kernel_entry) # kernel entry point
>> + UNWIND_HINT_EMPTY
> I'm not sure but I think this isn't needed, because
> "OBJECT_FILES_NON_STANDARD_head.o :=y"

Yes, you are right, will remove it.

>
>>
>> /* Config direct window and set PG */

...

>> void __init setup_arch(char **cmdline_p)
>> {
>> + unwind_init();
> I think this line should be after cpu_probe().

I am OK with making this change, but if so, there will be no stack trace
for the early code that runs before cpu_probe().

>
>> cpu_probe();
>>
>> init_environ();

...

>> diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
>> index 58151d0..bbd1d47 100644
>> --- a/arch/loongarch/power/Makefile
>> +++ b/arch/loongarch/power/Makefile
>> @@ -1,3 +1,5 @@
>> +OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
> hibernate_asm.o has no problem?

Yes, hibernate_asm.o is fine. Only suspend_asm.o produces a warning, so
only that file is marked to be ignored by objtool.

Thanks,
Tiezhu
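
As background for the unwind_init() ordering question: in the quoted patch, the fast lookup table consulted by orc_find() is only filled in by unwind_init(). The sketch below shows the idea in userspace. It is an illustration under assumptions, not the kernel code: BLOCK_SIZE stands in for LOOKUP_BLOCK_SIZE with a made-up value, the table holds absolute addresses, and search(), build_lookup() and fast_find() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 0x100UL	/* hypothetical stand-in for LOOKUP_BLOCK_SIZE */

/* Rightmost index with starts[i] <= ip (index 0 if there is none). */
static size_t search(const unsigned long *starts, size_t n, unsigned long ip)
{
	size_t lo = 0, hi = n, found = 0;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (starts[mid] <= ip) {
			found = mid;
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}
	return found;
}

/*
 * One slot per text block holding the entry index covering the block's
 * first address, plus a final slot for the end of the text range.
 */
static void build_lookup(const unsigned long *starts, size_t n,
			 unsigned long text_start, unsigned long text_end,
			 unsigned int *lookup, size_t nblocks)
{
	size_t i;

	for (i = 0; i + 1 < nblocks; i++)
		lookup[i] = (unsigned int)search(starts, n,
						 text_start + BLOCK_SIZE * i);
	lookup[nblocks - 1] = (unsigned int)search(starts, n, text_end);
}

/* Narrow the search to the window bracketed by two adjacent slots. */
static size_t fast_find(const unsigned long *starts,
			const unsigned int *lookup, size_t nblocks,
			unsigned long text_start, unsigned long ip)
{
	size_t idx = (ip - text_start) / BLOCK_SIZE;
	size_t lo, hi;

	assert(idx + 1 < nblocks);
	lo = lookup[idx];
	hi = (size_t)lookup[idx + 1] + 1;	/* exclusive end of the window */

	return lo + search(starts + lo, hi - lo, ip);
}
```

Until build_lookup() has run, every slot is zero and fast_find() would search a window containing only the first entry, which is one way to see why frames in early code cannot be resolved reliably before the table is initialised.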

2023-10-14 11:39:07

by Huacai Chen

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

+CC Jinyang

On Sat, Oct 14, 2023 at 5:21 PM Tiezhu Yang <[email protected]> wrote:
>
>
>
> On 10/11/2023 12:37 PM, Huacai Chen wrote:
> > Hi, Tiezhu,
> >
> > Maybe "LoongArch: Add ORC stack unwinder support" is better.
>
> OK, will modify it.
>
> >
> > On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
> >>
> >> The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
> >> similar in concept to a DWARF unwinder. The difference is that the format
> >> of the ORC data is much simpler than DWARF, which in turn allows the ORC
> >> unwinder to be much simpler and faster.
>
> ...
>
> >> +ifdef CONFIG_OBJTOOL
> >> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
> >> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
> >> +ifeq ($(shell as --help 2>&1 | grep -e '-mthin-add-sub'),)
> >> + $(error Sorry, you need a newer gas version with -mthin-add-sub option)
> > I prefer no error out here, because without this option we can still
> > built a runnable kernel.
>
> I agree that it is better not to error out and stop the build. However,
> old binutils produces many objtool warnings during compilation, so a
> message is still needed to tell users what happened and how to get rid
> of those warnings.
>
> That is to say, I would prefer to replace "error" with "warning", i.e.
> use $(warning ...) instead of $(error ...).
>
> >> +endif
> >> +KBUILD_AFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> >> +KBUILD_CFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> >> +KBUILD_CFLAGS += -fno-optimize-sibling-calls -fno-jump-tables -falign-functions=4
> >> +endif
>
> ...
>
> >> +#define ORC_REG_BP 3
> > Use FP instead of BP in this patch, too.
>
> OK, will do it.
>
> >
> >> +#define ORC_REG_MAX 4
>
> ...
>
> >> +.macro UNWIND_HINT_UNDEFINED
> >
> >> + UNWIND_HINT type=UNWIND_HINT_TYPE_UNDEFINED
> >> +.endm
> > We don't need to set sp_reg=ORC_REG_UNDEFINED for UNWIND_HINT_UNDEFINED?
>
> Yes, there is no need to set sp_reg. Instructions marked with UNDEFINED
> are blind spots in ORC coverage and are unrelated to the stack trace;
> this is the same as on x86.
>
> >
> >> +
> >> +.macro UNWIND_HINT_EMPTY
> >> + UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL
> >> +.endm
> > We don't need to define UNWIND_HINT_END_OF_STACK?
>
> Yes, it is useless now.
>
> >
> >> +
> >> +.macro UNWIND_HINT_REGS
> >> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_REGS
> >> +.endm
> >> +
> >> +.macro UNWIND_HINT_FUNC
> >> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_CALL
> >> +.endm
> > We don't need to set sp_offset for UNWIND_HINT_REGS and UNWIND_HINT_FUNC?
>
> sp_offset is 0 by default, so there is no need to set it unless a
> different value is required; see include/linux/objtool.h:
> .macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 signal=0
>
> >
> >> +
> >> +#endif /* __ASSEMBLY__ */
>
> ...
>
> >> diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
> >> index 65518bb..e43115f 100644
> >> --- a/arch/loongarch/kernel/entry.S
> >> +++ b/arch/loongarch/kernel/entry.S
> >> @@ -14,11 +14,13 @@
> >> #include <asm/regdef.h>
> >> #include <asm/stackframe.h>
> >> #include <asm/thread_info.h>
> >> +#include <asm/unwind_hints.h>
> >>
> >> .text
> >> .cfi_sections .debug_frame
> >> .align 5
> >> -SYM_FUNC_START(handle_syscall)
> >> +SYM_CODE_START(handle_syscall)
> > Why?
> >
>
> see include/linux/linkage.h
> FUNC -- C-like functions (proper stack frame etc.)
> CODE -- non-C code (e.g. irq handlers with different, special stack etc.)
Hi, Jinyang,

What do you think about it? In our internal repo, most asm functions
changed in this patch are still marked with FUNC, not CODE.

>
> >> + UNWIND_HINT_UNDEFINED
> >> csrrd t0, PERCPU_BASE_KS
>
> ...
>
> >> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> >> index 53b883d..5664390 100644
> >> --- a/arch/loongarch/kernel/head.S
> >> +++ b/arch/loongarch/kernel/head.S
> >> @@ -43,6 +43,7 @@ SYM_DATA(kernel_offset, .long _kernel_offset);
> >> .align 12
> >>
> >> SYM_CODE_START(kernel_entry) # kernel entry point
> >> + UNWIND_HINT_EMPTY
> > I'm not sure but I think this isn't needed, because
> > "OBJECT_FILES_NON_STANDARD_head.o :=y"
>
> Yes, you are right, will remove it.
>
> >
> >>
> >> /* Config direct window and set PG */
>
> ...
>
> >> void __init setup_arch(char **cmdline_p)
> >> {
> >> + unwind_init();
> > I think this line should be after cpu_probe().
>
> I am OK with this change, but then there is no stack trace for the
> early code that runs before cpu_probe().
As I said before, stack trace needs printk, but printk cannot work
before cpu_probe().

>
> >
> >> cpu_probe();
> >>
> >> init_environ();
>
> ...
>
> >> diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
> >> index 58151d0..bbd1d47 100644
> >> --- a/arch/loongarch/power/Makefile
> >> +++ b/arch/loongarch/power/Makefile
> >> @@ -1,3 +1,5 @@
> >> +OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
> > hibernate_asm.o has no problem?
>
> Yes, only suspend_asm.o has one warning, just ignore it.
What kind of warning? When I submitted the suspend patch, Jinyang told
me that with his changes loongarch_suspend_enter() can be a regular
function.

Huacai
>
> Thanks,
> Tiezhu
>
>
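
For readers following the thread: the quoted commit message says the ORC
format is simple enough to allow a simple and fast unwinder. At lookup time
this comes down to a binary range search over a sorted table of start
addresses, which is what the __orc_find() helper in the patch under review
does. Below is a minimal userspace sketch of that search. It assumes plain
absolute addresses rather than the patch's relative offsets, and
demo_orc_find() and its parameters are hypothetical names, not part of the
patch.

```c
#include <stddef.h>

/*
 * Hypothetical userspace sketch, not code from the patch: return the index
 * of the rightmost entry whose start address is <= ip, or -1 for an empty
 * table. starts[] must be sorted in ascending order.
 */
static long demo_orc_find(const unsigned long *starts, size_t n,
			  unsigned long ip)
{
	long first = 0, last = (long)n - 1, found = 0;

	if (n == 0)
		return -1;

	while (first <= last) {
		long mid = first + (last - first) / 2;

		if (starts[mid] <= ip) {
			found = mid;	/* candidate; keep looking to the right */
			first = mid + 1;
		} else {
			last = mid - 1;
		}
	}

	/* If ip precedes every entry, index 0 is returned, as in the patch. */
	return found;
}
```

With starts = { 0x100, 0x100, 0x140, 0x200 }, a lookup of 0x100 selects
index 1, the rightmost duplicate, so a "weak" terminator entry stored first
would lose to the real entry that follows it.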

2023-10-14 11:42:11

by Huacai Chen

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>
> The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
> similar in concept to a DWARF unwinder. The difference is that the format
> of the ORC data is much simpler than DWARF, which in turn allows the ORC
> unwinder to be much simpler and faster.
>
> The ORC data consists of unwind tables which are generated by objtool.
> They contain out-of-band data which is used by the in-kernel ORC unwinder.
> Objtool generates the ORC data by first doing compile-time stack metadata
> validation (CONFIG_STACK_VALIDATION). After analyzing all the code paths
> of a .o file, it determines information about the stack state at each
> instruction address in the file and outputs that information to the
> .orc_unwind and .orc_unwind_ip sections.
>
> The per-object ORC sections are combined at link time and are sorted and
> post-processed at boot time. The unwinder uses the resulting data to
> correlate instruction addresses with their stack states at run time.
>
> Most of the logic is similar to x86. In order to get ra info before ra
> is saved onto the stack, add ra_reg and ra_offset to orc_entry. At the same
> time, modify some arch-specific code to silence the objtool warnings.
>
> Co-developed-by: Jinyang He <[email protected]>
> Signed-off-by: Jinyang He <[email protected]>
> Co-developed-by: Youling Tang <[email protected]>
> Signed-off-by: Youling Tang <[email protected]>
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> arch/loongarch/Kconfig | 2 +
> arch/loongarch/Kconfig.debug | 11 +
> arch/loongarch/Makefile | 23 ++
> arch/loongarch/configs/loongson3_defconfig | 1 +
> arch/loongarch/include/asm/Kbuild | 1 +
> arch/loongarch/include/asm/bug.h | 1 +
> arch/loongarch/include/asm/linkage.h | 2 +
> arch/loongarch/include/asm/module.h | 7 +
> arch/loongarch/include/asm/orc_header.h | 19 +
> arch/loongarch/include/asm/orc_lookup.h | 34 ++
> arch/loongarch/include/asm/orc_types.h | 58 +++
> arch/loongarch/include/asm/stackframe.h | 3 +
> arch/loongarch/include/asm/unwind.h | 22 +-
> arch/loongarch/include/asm/unwind_hints.h | 28 ++
> arch/loongarch/kernel/Makefile | 3 +
> arch/loongarch/kernel/entry.S | 9 +-
> arch/loongarch/kernel/genex.S | 20 +-
> arch/loongarch/kernel/head.S | 1 +
> arch/loongarch/kernel/module.c | 11 +-
> arch/loongarch/kernel/relocate_kernel.S | 2 +
> arch/loongarch/kernel/setup.c | 2 +
> arch/loongarch/kernel/stacktrace.c | 1 +
> arch/loongarch/kernel/unwind_orc.c | 571 +++++++++++++++++++++++++++++
> arch/loongarch/kernel/vmlinux.lds.S | 3 +
> arch/loongarch/lib/Makefile | 2 +
> arch/loongarch/mm/tlbex.S | 45 ++-
> arch/loongarch/power/Makefile | 2 +
> arch/loongarch/vdso/Makefile | 1 +
> include/linux/compiler.h | 9 +
> scripts/Makefile | 5 +-
> 30 files changed, 867 insertions(+), 32 deletions(-)
> create mode 100644 arch/loongarch/include/asm/orc_header.h
> create mode 100644 arch/loongarch/include/asm/orc_lookup.h
> create mode 100644 arch/loongarch/include/asm/orc_types.h
> create mode 100644 arch/loongarch/include/asm/unwind_hints.h
> create mode 100644 arch/loongarch/kernel/unwind_orc.c
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index e14396a..21ef3bb 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -131,6 +131,7 @@ config LOONGARCH
> select HAVE_KRETPROBES
> select HAVE_MOD_ARCH_SPECIFIC
> select HAVE_NMI
> + select HAVE_OBJTOOL if AS_HAS_EXPLICIT_RELOCS
> select HAVE_PCI
> select HAVE_PERF_EVENTS
> select HAVE_PERF_REGS
> @@ -141,6 +142,7 @@ config LOONGARCH
> select HAVE_SAMPLE_FTRACE_DIRECT
> select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
> select HAVE_SETUP_PER_CPU_AREA if NUMA
> + select HAVE_STACK_VALIDATION if HAVE_OBJTOOL
> select HAVE_STACKPROTECTOR
> select HAVE_SYSCALL_TRACEPOINTS
> select HAVE_TIF_NOHZ
> diff --git a/arch/loongarch/Kconfig.debug b/arch/loongarch/Kconfig.debug
> index 8d36aab..98d6063 100644
> --- a/arch/loongarch/Kconfig.debug
> +++ b/arch/loongarch/Kconfig.debug
> @@ -26,4 +26,15 @@ config UNWINDER_PROLOGUE
> Some of the addresses it reports may be incorrect (but better than the
> Guess unwinder).
>
> +config UNWINDER_ORC
> + bool "ORC unwinder"
> + select OBJTOOL
> + help
> + This option enables the ORC (Oops Rewind Capability) unwinder for
> + unwinding kernel stack traces. It uses a custom data format which is
> + a simplified version of the DWARF Call Frame Information standard.
> +
> + Enabling this option will increase the kernel's runtime memory usage
> + by roughly 2-4MB, depending on your kernel config.
> +
> endchoice
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index fb0fada..89a6e61 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -25,6 +25,29 @@ endif
> 32bit-emul = elf32loongarch
> 64bit-emul = elf64loongarch
>
> +ifdef CONFIG_OBJTOOL
> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
> +ifeq ($(shell as --help 2>&1 | grep -e '-mthin-add-sub'),)
> + $(error Sorry, you need a newer gas version with -mthin-add-sub option)
> +endif
> +KBUILD_AFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> +KBUILD_CFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> +KBUILD_CFLAGS += -fno-optimize-sibling-calls -fno-jump-tables -falign-functions=4
> +endif
> +
> +ifdef CONFIG_UNWINDER_ORC
> +orc_hash_h := arch/$(SRCARCH)/include/generated/asm/orc_hash.h
> +orc_hash_sh := $(srctree)/scripts/orc_hash.sh
> +targets += $(orc_hash_h)
> +quiet_cmd_orc_hash = GEN $@
> + cmd_orc_hash = mkdir -p $(dir $@); \
> + $(CONFIG_SHELL) $(orc_hash_sh) < $< > $@
> +$(orc_hash_h): $(srctree)/arch/loongarch/include/asm/orc_types.h $(orc_hash_sh) FORCE
> + $(call if_changed,orc_hash)
> +archprepare: $(orc_hash_h)
> +endif
> +
> ifdef CONFIG_DYNAMIC_FTRACE
> KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
> CC_FLAGS_FTRACE := -fpatchable-function-entry=2
> diff --git a/arch/loongarch/configs/loongson3_defconfig b/arch/loongarch/configs/loongson3_defconfig
> index a3b52aa..de911c3 100644
> --- a/arch/loongarch/configs/loongson3_defconfig
> +++ b/arch/loongarch/configs/loongson3_defconfig
> @@ -5,6 +5,7 @@ CONFIG_NO_HZ=y
> CONFIG_HIGH_RES_TIMERS=y
> CONFIG_BPF_SYSCALL=y
> CONFIG_BPF_JIT=y
> +CONFIG_BPF_JIT_ALWAYS_ON=y
Does BPF have something to do with ORC?

Huacai

> CONFIG_PREEMPT=y
> CONFIG_BSD_PROCESS_ACCT=y
> CONFIG_BSD_PROCESS_ACCT_V3=y
> diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild
> index 93783fa..2bb285c 100644
> --- a/arch/loongarch/include/asm/Kbuild
> +++ b/arch/loongarch/include/asm/Kbuild
> @@ -1,4 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0
> +generated-y += orc_hash.h
> generic-y += dma-contiguous.h
> generic-y += mcs_spinlock.h
> generic-y += parport.h
> diff --git a/arch/loongarch/include/asm/bug.h b/arch/loongarch/include/asm/bug.h
> index d4ca3ba..0838887 100644
> --- a/arch/loongarch/include/asm/bug.h
> +++ b/arch/loongarch/include/asm/bug.h
> @@ -44,6 +44,7 @@
> do { \
> instrumentation_begin(); \
> __BUG_FLAGS(BUGFLAG_WARNING|(flags)); \
> + annotate_reachable(); \
> instrumentation_end(); \
> } while (0)
>
> diff --git a/arch/loongarch/include/asm/linkage.h b/arch/loongarch/include/asm/linkage.h
> index 81b0c4c..ae4e100 100644
> --- a/arch/loongarch/include/asm/linkage.h
> +++ b/arch/loongarch/include/asm/linkage.h
> @@ -2,6 +2,8 @@
> #ifndef __ASM_LINKAGE_H
> #define __ASM_LINKAGE_H
>
> +#include <asm/unwind_hints.h>
> +
> #define __ALIGN .align 2
> #define __ALIGN_STR __stringify(__ALIGN)
>
> diff --git a/arch/loongarch/include/asm/module.h b/arch/loongarch/include/asm/module.h
> index 2ecd82b..96af0ba 100644
> --- a/arch/loongarch/include/asm/module.h
> +++ b/arch/loongarch/include/asm/module.h
> @@ -6,6 +6,7 @@
> #define _ASM_MODULE_H
>
> #include <asm/inst.h>
> +#include <asm/orc_types.h>
> #include <asm-generic/module.h>
>
> #define RELA_STACK_DEPTH 16
> @@ -23,6 +24,12 @@ struct mod_arch_specific {
>
> /* For CONFIG_DYNAMIC_FTRACE */
> struct plt_entry *ftrace_trampolines;
> +
> +#ifdef CONFIG_UNWINDER_ORC
> + unsigned int num_orcs;
> + int *orc_unwind_ip;
> + struct orc_entry *orc_unwind;
> +#endif
> };
>
> struct got_entry {
> diff --git a/arch/loongarch/include/asm/orc_header.h b/arch/loongarch/include/asm/orc_header.h
> new file mode 100644
> index 0000000..07bacf3
> --- /dev/null
> +++ b/arch/loongarch/include/asm/orc_header.h
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/* Copyright (c) Meta Platforms, Inc. and affiliates. */
> +
> +#ifndef _ORC_HEADER_H
> +#define _ORC_HEADER_H
> +
> +#include <linux/types.h>
> +#include <linux/compiler.h>
> +#include <asm/orc_hash.h>
> +
> +/*
> + * The header is currently a 20-byte hash of the ORC entry definition; see
> + * scripts/orc_hash.sh.
> + */
> +#define ORC_HEADER \
> + __used __section(".orc_header") __aligned(4) \
> + static const u8 orc_header[] = { ORC_HASH }
> +
> +#endif /* _ORC_HEADER_H */
> diff --git a/arch/loongarch/include/asm/orc_lookup.h b/arch/loongarch/include/asm/orc_lookup.h
> new file mode 100644
> index 0000000..2416312
> --- /dev/null
> +++ b/arch/loongarch/include/asm/orc_lookup.h
> @@ -0,0 +1,34 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Copyright (C) 2017 Josh Poimboeuf <[email protected]>
> + */
> +#ifndef _ORC_LOOKUP_H
> +#define _ORC_LOOKUP_H
> +
> +/*
> + * This is a lookup table for speeding up access to the .orc_unwind table.
> + * Given an input address offset, the corresponding lookup table entry
> + * specifies a subset of the .orc_unwind table to search.
> + *
> + * Each block represents the end of the previous range and the start of the
> + * next range. An extra block is added to give the last range an end.
> + *
> + * The block size should be a power of 2 to avoid a costly 'div' instruction.
> + *
> + * A block size of 256 was chosen because it roughly doubles unwinder
> + * performance while only adding ~5% to the ORC data footprint.
> + */
> +#define LOOKUP_BLOCK_ORDER 8
> +#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER)
> +
> +#ifndef LINKER_SCRIPT
> +
> +extern unsigned int orc_lookup[];
> +extern unsigned int orc_lookup_end[];
> +
> +#define LOOKUP_START_IP (unsigned long)_stext
> +#define LOOKUP_STOP_IP (unsigned long)_etext
> +
> +#endif /* LINKER_SCRIPT */
> +
> +#endif /* _ORC_LOOKUP_H */
> diff --git a/arch/loongarch/include/asm/orc_types.h b/arch/loongarch/include/asm/orc_types.h
> new file mode 100644
> index 0000000..1d37e62
> --- /dev/null
> +++ b/arch/loongarch/include/asm/orc_types.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _ORC_TYPES_H
> +#define _ORC_TYPES_H
> +
> +#include <linux/types.h>
> +
> +/*
> + * The ORC_REG_* registers are base registers which are used to find other
> + * registers on the stack.
> + *
> + * ORC_REG_PREV_SP, also known as DWARF Call Frame Address (CFA), is the
> + * address of the previous frame: the caller's SP before it called the current
> + * function.
> + *
> + * ORC_REG_UNDEFINED means the corresponding register's value didn't change in
> + * the current frame.
> + *
> + * The most commonly used base registers are SP and BP -- which the previous SP
> + * is usually based on -- and PREV_SP and UNDEFINED -- which the previous BP is
> + * usually based on.
> + *
> + * The rest of the base registers are needed for special cases like entry code
> + * and GCC realigned stacks.
> + */
> +#define ORC_REG_UNDEFINED 0
> +#define ORC_REG_PREV_SP 1
> +#define ORC_REG_SP 2
> +#define ORC_REG_BP 3
> +#define ORC_REG_MAX 4
> +
> +#define ORC_TYPE_UNDEFINED 0
> +#define ORC_TYPE_END_OF_STACK 1
> +#define ORC_TYPE_CALL 2
> +#define ORC_TYPE_REGS 3
> +#define ORC_TYPE_REGS_PARTIAL 4
> +
> +#ifndef __ASSEMBLY__
> +/*
> + * This struct is more or less a vastly simplified version of the DWARF Call
> + * Frame Information standard. It contains only the necessary parts of DWARF
> + * CFI, simplified for ease of access by the in-kernel unwinder. It tells the
> + * unwinder how to find the previous SP and BP (and sometimes entry regs) on
> + * the stack for a given code address. Each instance of the struct corresponds
> + * to one or more code locations.
> + */
> +struct orc_entry {
> + s16 sp_offset;
> + s16 bp_offset;
> + s16 ra_offset;
> + unsigned int sp_reg:4;
> + unsigned int bp_reg:4;
> + unsigned int ra_reg:4;
> + unsigned int type:3;
> + unsigned int signal:1;
> +};
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ORC_TYPES_H */
> diff --git a/arch/loongarch/include/asm/stackframe.h b/arch/loongarch/include/asm/stackframe.h
> index 4fb1e64..45b507a 100644
> --- a/arch/loongarch/include/asm/stackframe.h
> +++ b/arch/loongarch/include/asm/stackframe.h
> @@ -13,6 +13,7 @@
> #include <asm/asm-offsets.h>
> #include <asm/loongarch.h>
> #include <asm/thread_info.h>
> +#include <asm/unwind_hints.h>
>
> /* Make the addition of cfi info a little easier. */
> .macro cfi_rel_offset reg offset=0 docfi=0
> @@ -162,6 +163,7 @@
> li.w t0, CSR_CRMD_WE
> csrxchg t0, t0, LOONGARCH_CSR_CRMD
> #endif
> + UNWIND_HINT_REGS
> .endm
>
> .macro SAVE_ALL docfi=0
> @@ -219,6 +221,7 @@
>
> .macro RESTORE_SP_AND_RET docfi=0
> cfi_ld sp, PT_R3, \docfi
> + UNWIND_HINT_FUNC
> ertn
> .endm
>
> diff --git a/arch/loongarch/include/asm/unwind.h b/arch/loongarch/include/asm/unwind.h
> index b9dce87..d36e04e 100644
> --- a/arch/loongarch/include/asm/unwind.h
> +++ b/arch/loongarch/include/asm/unwind.h
> @@ -16,6 +16,7 @@
> enum unwinder_type {
> UNWINDER_GUESS,
> UNWINDER_PROLOGUE,
> + UNWINDER_ORC,
> };
>
> struct unwind_state {
> @@ -24,7 +25,7 @@ struct unwind_state {
> struct task_struct *task;
> bool first, error, reset;
> int graph_idx;
> - unsigned long sp, pc, ra;
> + unsigned long sp, pc, ra, fp;
> };
>
> bool default_next_frame(struct unwind_state *state);
> @@ -34,6 +35,17 @@ void unwind_start(struct unwind_state *state,
> bool unwind_next_frame(struct unwind_state *state);
> unsigned long unwind_get_return_address(struct unwind_state *state);
>
> +#ifdef CONFIG_UNWINDER_ORC
> +void unwind_init(void);
> +void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
> + void *orc, size_t orc_size);
> +#else
> +static inline void unwind_init(void) {}
> +static inline
> +void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
> + void *orc, size_t orc_size) {}
> +#endif
> +
> static inline bool unwind_done(struct unwind_state *state)
> {
> return state->stack_info.type == STACK_TYPE_UNKNOWN;
> @@ -61,14 +73,17 @@ static __always_inline void __unwind_start(struct unwind_state *state,
> state->sp = regs->regs[3];
> state->pc = regs->csr_era;
> state->ra = regs->regs[1];
> + state->fp = regs->regs[22];
> } else if (task && task != current) {
> state->sp = thread_saved_fp(task);
> state->pc = thread_saved_ra(task);
> state->ra = 0;
> + state->fp = 0;
> } else {
> state->sp = (unsigned long)__builtin_frame_address(0);
> state->pc = (unsigned long)__builtin_return_address(0);
> state->ra = 0;
> + state->fp = 0;
> }
> state->task = task;
> get_stack_info(state->sp, state->task, &state->stack_info);
> @@ -77,6 +92,9 @@ static __always_inline void __unwind_start(struct unwind_state *state,
>
> static __always_inline unsigned long __unwind_get_return_address(struct unwind_state *state)
> {
> - return unwind_done(state) ? 0 : state->pc;
> + if (unwind_done(state))
> + return 0;
> +
> + return __kernel_text_address(state->pc) ? state->pc : 0;
> }
> #endif /* _ASM_UNWIND_H */
> diff --git a/arch/loongarch/include/asm/unwind_hints.h b/arch/loongarch/include/asm/unwind_hints.h
> new file mode 100644
> index 0000000..82443fe
> --- /dev/null
> +++ b/arch/loongarch/include/asm/unwind_hints.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_LOONGARCH_UNWIND_HINTS_H
> +#define _ASM_LOONGARCH_UNWIND_HINTS_H
> +
> +#include <linux/objtool.h>
> +#include <asm/orc_types.h>
> +
> +#ifdef __ASSEMBLY__
> +
> +.macro UNWIND_HINT_UNDEFINED
> + UNWIND_HINT type=UNWIND_HINT_TYPE_UNDEFINED
> +.endm
> +
> +.macro UNWIND_HINT_EMPTY
> + UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL
> +.endm
> +
> +.macro UNWIND_HINT_REGS
> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_REGS
> +.endm
> +
> +.macro UNWIND_HINT_FUNC
> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_CALL
> +.endm
> +
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ASM_LOONGARCH_UNWIND_HINTS_H */
> diff --git a/arch/loongarch/kernel/Makefile b/arch/loongarch/kernel/Makefile
> index 4fcc168..a89428c 100644
> --- a/arch/loongarch/kernel/Makefile
> +++ b/arch/loongarch/kernel/Makefile
> @@ -3,6 +3,8 @@
> # Makefile for the Linux/LoongArch kernel.
> #
>
> +OBJECT_FILES_NON_STANDARD_head.o := y
> +
> extra-y := vmlinux.lds
>
> obj-y += head.o cpu-probe.o cacheinfo.o env.o setup.o entry.o genex.o \
> @@ -62,6 +64,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
>
> obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
> obj-$(CONFIG_UNWINDER_PROLOGUE) += unwind_prologue.o
> +obj-$(CONFIG_UNWINDER_ORC) += unwind_orc.o
>
> obj-$(CONFIG_PERF_EVENTS) += perf_event.o perf_regs.o
> obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
> diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
> index 65518bb..e43115f 100644
> --- a/arch/loongarch/kernel/entry.S
> +++ b/arch/loongarch/kernel/entry.S
> @@ -14,11 +14,13 @@
> #include <asm/regdef.h>
> #include <asm/stackframe.h>
> #include <asm/thread_info.h>
> +#include <asm/unwind_hints.h>
>
> .text
> .cfi_sections .debug_frame
> .align 5
> -SYM_FUNC_START(handle_syscall)
> +SYM_CODE_START(handle_syscall)
> + UNWIND_HINT_UNDEFINED
> csrrd t0, PERCPU_BASE_KS
> la.pcrel t1, kernelsp
> add.d t1, t1, t0
> @@ -56,6 +58,7 @@ SYM_FUNC_START(handle_syscall)
> cfi_st u0, PT_R21
> cfi_st fp, PT_R22
>
> + UNWIND_HINT_REGS
> SAVE_STATIC
>
> #ifdef CONFIG_KGDB
> @@ -71,10 +74,11 @@ SYM_FUNC_START(handle_syscall)
> bl do_syscall
>
> RESTORE_ALL_AND_RET
> -SYM_FUNC_END(handle_syscall)
> +SYM_CODE_END(handle_syscall)
> _ASM_NOKPROBE(handle_syscall)
>
> SYM_CODE_START(ret_from_fork)
> + UNWIND_HINT_REGS
> bl schedule_tail # a0 = struct task_struct *prev
> move a0, sp
> bl syscall_exit_to_user_mode
> @@ -84,6 +88,7 @@ SYM_CODE_START(ret_from_fork)
> SYM_CODE_END(ret_from_fork)
>
> SYM_CODE_START(ret_from_kernel_thread)
> + UNWIND_HINT_REGS
> bl schedule_tail # a0 = struct task_struct *prev
> move a0, s1
> jirl ra, s0, 0
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 78f0663..3f18e3b 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -31,7 +31,8 @@ SYM_FUNC_START(__arch_cpu_idle)
> 1: jr ra
> SYM_FUNC_END(__arch_cpu_idle)
>
> -SYM_FUNC_START(handle_vint)
> +SYM_CODE_START(handle_vint)
> + UNWIND_HINT_UNDEFINED
> BACKUP_T0T1
> SAVE_ALL
> la_abs t1, __arch_cpu_idle
> @@ -46,11 +47,12 @@ SYM_FUNC_START(handle_vint)
> la_abs t0, do_vint
> jirl ra, t0, 0
> RESTORE_ALL_AND_RET
> -SYM_FUNC_END(handle_vint)
> +SYM_CODE_END(handle_vint)
>
> -SYM_FUNC_START(except_vec_cex)
> +SYM_CODE_START(except_vec_cex)
> + UNWIND_HINT_UNDEFINED
> b cache_parity_error
> -SYM_FUNC_END(except_vec_cex)
> +SYM_CODE_END(except_vec_cex)
>
> .macro build_prep_badv
> csrrd t0, LOONGARCH_CSR_BADV
> @@ -66,7 +68,8 @@ SYM_FUNC_END(except_vec_cex)
>
> .macro BUILD_HANDLER exception handler prep
> .align 5
> - SYM_FUNC_START(handle_\exception)
> + SYM_CODE_START(handle_\exception)
> + UNWIND_HINT_UNDEFINED
> 666:
> BACKUP_T0T1
> SAVE_ALL
> @@ -76,7 +79,7 @@ SYM_FUNC_END(except_vec_cex)
> jirl ra, t0, 0
> 668:
> RESTORE_ALL_AND_RET
> - SYM_FUNC_END(handle_\exception)
> + SYM_CODE_END(handle_\exception)
> SYM_DATA(unwind_hint_\exception, .word 668b - 666b)
> .endm
>
> @@ -93,7 +96,8 @@ SYM_FUNC_END(except_vec_cex)
> BUILD_HANDLER watch watch none
> BUILD_HANDLER reserved reserved none /* others */
>
> -SYM_FUNC_START(handle_sys)
> +SYM_CODE_START(handle_sys)
> + UNWIND_HINT_UNDEFINED
> la_abs t0, handle_syscall
> jr t0
> -SYM_FUNC_END(handle_sys)
> +SYM_CODE_END(handle_sys)
> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> index 53b883d..5664390 100644
> --- a/arch/loongarch/kernel/head.S
> +++ b/arch/loongarch/kernel/head.S
> @@ -43,6 +43,7 @@ SYM_DATA(kernel_offset, .long _kernel_offset);
> .align 12
>
> SYM_CODE_START(kernel_entry) # kernel entry point
> + UNWIND_HINT_EMPTY
>
> /* Config direct window and set PG */
> li.d t0, CSR_DMW0_INIT # UC, PLV0, 0x8000 xxxx xxxx xxxx
> diff --git a/arch/loongarch/kernel/module.c b/arch/loongarch/kernel/module.c
> index b13b285..83db7e5 100644
> --- a/arch/loongarch/kernel/module.c
> +++ b/arch/loongarch/kernel/module.c
> @@ -20,6 +20,7 @@
> #include <linux/kernel.h>
> #include <asm/alternative.h>
> #include <asm/inst.h>
> +#include <asm/unwind.h>
>
> static int rela_stack_push(s64 stack_value, s64 *rela_stack, size_t *rela_stack_top)
> {
> @@ -515,7 +516,7 @@ static void module_init_ftrace_plt(const Elf_Ehdr *hdr,
> int module_finalize(const Elf_Ehdr *hdr,
> const Elf_Shdr *sechdrs, struct module *mod)
> {
> - const Elf_Shdr *s, *se;
> + const Elf_Shdr *s, *se, *orc = NULL, *orc_ip = NULL;
> const char *secstrs = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
>
> for (s = sechdrs, se = sechdrs + hdr->e_shnum; s < se; s++) {
> @@ -523,7 +524,15 @@ int module_finalize(const Elf_Ehdr *hdr,
> apply_alternatives((void *)s->sh_addr, (void *)s->sh_addr + s->sh_size);
> if (!strcmp(".ftrace_trampoline", secstrs + s->sh_name))
> module_init_ftrace_plt(hdr, s, mod);
> + if (!strcmp(".orc_unwind", secstrs + s->sh_name))
> + orc = s;
> + if (!strcmp(".orc_unwind_ip", secstrs + s->sh_name))
> + orc_ip = s;
> }
>
> + if (orc && orc_ip)
> + unwind_module_init(mod, (void *)orc_ip->sh_addr, orc_ip->sh_size,
> + (void *)orc->sh_addr, orc->sh_size);
> +
> return 0;
> }
> diff --git a/arch/loongarch/kernel/relocate_kernel.S b/arch/loongarch/kernel/relocate_kernel.S
> index f49f6b0..bcc191d 100644
> --- a/arch/loongarch/kernel/relocate_kernel.S
> +++ b/arch/loongarch/kernel/relocate_kernel.S
> @@ -15,6 +15,7 @@
> #include <asm/addrspace.h>
>
> SYM_CODE_START(relocate_new_kernel)
> + UNWIND_HINT_UNDEFINED
> /*
> * a0: EFI boot flag for the new kernel
> * a1: Command line pointer for the new kernel
> @@ -90,6 +91,7 @@ SYM_CODE_END(relocate_new_kernel)
> * then start at the entry point from LOONGARCH_IOCSR_MBUF0.
> */
> SYM_CODE_START(kexec_smp_wait)
> + UNWIND_HINT_UNDEFINED
> 1: li.w t0, 0x100 /* wait for init loop */
> 2: addi.w t0, t0, -1 /* limit mailbox access */
> bnez t0, 2b
> diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
> index 7783f0a..a173b02 100644
> --- a/arch/loongarch/kernel/setup.c
> +++ b/arch/loongarch/kernel/setup.c
> @@ -48,6 +48,7 @@
> #include <asm/sections.h>
> #include <asm/setup.h>
> #include <asm/time.h>
> +#include <asm/unwind.h>
>
> #define SMBIOS_BIOSSIZE_OFFSET 0x09
> #define SMBIOS_BIOSEXTERN_OFFSET 0x13
> @@ -605,6 +606,7 @@ static void __init prefill_possible_map(void)
>
> void __init setup_arch(char **cmdline_p)
> {
> + unwind_init();
> cpu_probe();
>
> init_environ();
> diff --git a/arch/loongarch/kernel/stacktrace.c b/arch/loongarch/kernel/stacktrace.c
> index 92270f1..9848d42 100644
> --- a/arch/loongarch/kernel/stacktrace.c
> +++ b/arch/loongarch/kernel/stacktrace.c
> @@ -29,6 +29,7 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> regs->csr_era = thread_saved_ra(task);
> }
> regs->regs[1] = 0;
> + regs->regs[22] = 0;
> }
>
> for (unwind_start(&state, task, regs);
> diff --git a/arch/loongarch/kernel/unwind_orc.c b/arch/loongarch/kernel/unwind_orc.c
> new file mode 100644
> index 0000000..08f80ca0
> --- /dev/null
> +++ b/arch/loongarch/kernel/unwind_orc.c
> @@ -0,0 +1,571 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/objtool.h>
> +#include <linux/module.h>
> +#include <linux/sort.h>
> +#include <asm/exception.h>
> +#include <asm/orc_types.h>
> +#include <asm/orc_lookup.h>
> +#include <asm/orc_header.h>
> +#include <asm/ptrace.h>
> +#include <asm/setup.h>
> +#include <asm/stacktrace.h>
> +#include <asm/tlb.h>
> +#include <asm/unwind.h>
> +
> +ORC_HEADER;
> +
> +#define orc_warn(fmt, ...) \
> + printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
> +
> +extern int __start_orc_unwind_ip[];
> +extern int __stop_orc_unwind_ip[];
> +extern struct orc_entry __start_orc_unwind[];
> +extern struct orc_entry __stop_orc_unwind[];
> +
> +static bool orc_init __ro_after_init;
> +static unsigned int lookup_num_blocks __ro_after_init;
> +
> +/* Fake frame pointer entry -- used as a fallback for generated code */
> +static struct orc_entry orc_fp_entry = {
> + .type = UNWIND_HINT_TYPE_CALL,
> + .sp_reg = ORC_REG_BP,
> + .sp_offset = 16,
> + .bp_reg = ORC_REG_PREV_SP,
> + .bp_offset = -16,
> + .ra_reg = ORC_REG_PREV_SP,
> + .ra_offset = -8,
> +};
> +
> +static inline unsigned long orc_ip(const int *ip)
> +{
> + return (unsigned long)ip + *ip;
> +}
> +
> +static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table,
> + unsigned int num_entries, unsigned long ip)
> +{
> + int *first = ip_table;
> + int *last = ip_table + num_entries - 1;
> + int *mid = first, *found = first;
> +
> + if (!num_entries)
> + return NULL;
> +
> + /*
> + * Do a binary range search to find the rightmost duplicate of a given
> + * starting address. Some entries are section terminators which are
> + * "weak" entries for ensuring there are no gaps. They should be
> + * ignored when they conflict with a real entry.
> + */
> + while (first <= last) {
> + mid = first + ((last - first) / 2);
> +
> + if (orc_ip(mid) <= ip) {
> + found = mid;
> + first = mid + 1;
> + } else
> + last = mid - 1;
> + }
> +
> + return u_table + (found - ip_table);
> +}
> +
> +#ifdef CONFIG_MODULES
> +static struct orc_entry *orc_module_find(unsigned long ip)
> +{
> + struct module *mod;
> +
> + mod = __module_address(ip);
> + if (!mod || !mod->arch.orc_unwind || !mod->arch.orc_unwind_ip)
> + return NULL;
> + return __orc_find(mod->arch.orc_unwind_ip, mod->arch.orc_unwind,
> + mod->arch.num_orcs, ip);
> +}
> +#else
> +static struct orc_entry *orc_module_find(unsigned long ip)
> +{
> + return NULL;
> +}
> +#endif
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +static struct orc_entry *orc_find(unsigned long ip);
> +
> +/*
> + * Ftrace dynamic trampolines do not have orc entries of their own.
> + * But they are copies of the ftrace entries that are static and
> + * defined in ftrace_*.S, which do have orc entries.
> + *
> + * If the unwinder comes across a ftrace trampoline, then find the
> + * ftrace function that was used to create it, and use that ftrace
> + * function's orc entry, as the placement of the return code in
> + * the stack will be identical.
> + */
> +static struct orc_entry *orc_ftrace_find(unsigned long ip)
> +{
> + struct ftrace_ops *ops;
> + unsigned long tramp_addr, offset;
> +
> + ops = ftrace_ops_trampoline(ip);
> + if (!ops)
> + return NULL;
> +
> + /* Set tramp_addr to the start of the code copied by the trampoline */
> + if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
> + tramp_addr = (unsigned long)ftrace_regs_caller;
> + else
> + tramp_addr = (unsigned long)ftrace_caller;
> +
> + /* Now place tramp_addr to the location within the trampoline ip is at */
> + offset = ip - ops->trampoline;
> + tramp_addr += offset;
> +
> + /* Prevent unlikely recursion */
> + if (ip == tramp_addr)
> + return NULL;
> +
> + return orc_find(tramp_addr);
> +}
> +#else
> +static struct orc_entry *orc_ftrace_find(unsigned long ip)
> +{
> + return NULL;
> +}
> +#endif
> +
> +/*
> + * If we crash with IP==0, the last successfully executed instruction
> + * was probably an indirect function call with a NULL function pointer,
> + * and we don't have unwind information for NULL.
> + * This hardcoded ORC entry for IP==0 allows us to unwind from a NULL function
> + * pointer into its parent and then continue normally from there.
> + */
> +static struct orc_entry null_orc_entry = {
> + .sp_offset = sizeof(long),
> + .sp_reg = ORC_REG_SP,
> + .bp_reg = ORC_REG_UNDEFINED,
> + .type = ORC_TYPE_CALL
> +};
> +
> +static struct orc_entry *orc_find(unsigned long ip)
> +{
> + static struct orc_entry *orc;
> +
> + if (ip == 0)
> + return &null_orc_entry;
> +
> + /* For non-init vmlinux addresses, use the fast lookup table: */
> + if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
> + unsigned int idx, start, stop;
> +
> + idx = (ip - LOOKUP_START_IP) / LOOKUP_BLOCK_SIZE;
> +
> + if (unlikely((idx >= lookup_num_blocks-1))) {
> + orc_warn("WARNING: bad lookup idx: idx=%u num=%u ip=%pB\n",
> + idx, lookup_num_blocks, (void *)ip);
> + return NULL;
> + }
> +
> + start = orc_lookup[idx];
> + stop = orc_lookup[idx + 1] + 1;
> +
> + if (unlikely((__start_orc_unwind + start >= __stop_orc_unwind) ||
> + (__start_orc_unwind + stop > __stop_orc_unwind))) {
> + orc_warn("WARNING: bad lookup value: idx=%u num=%u start=%u stop=%u ip=%pB\n",
> + idx, lookup_num_blocks, start, stop, (void *)ip);
> + return NULL;
> + }
> +
> + return __orc_find(__start_orc_unwind_ip + start,
> + __start_orc_unwind + start, stop - start, ip);
> + }
> +
> + /* vmlinux .init slow lookup: */
> + if (is_kernel_inittext(ip))
> + return __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
> + __stop_orc_unwind_ip - __start_orc_unwind_ip, ip);
> +
> + /* Module lookup: */
> + orc = orc_module_find(ip);
> + if (orc)
> + return orc;
> +
> + return orc_ftrace_find(ip);
> +}
> +
> +#ifdef CONFIG_MODULES
> +
> +static DEFINE_MUTEX(sort_mutex);
> +static int *cur_orc_ip_table = __start_orc_unwind_ip;
> +static struct orc_entry *cur_orc_table = __start_orc_unwind;
> +
> +static void orc_sort_swap(void *_a, void *_b, int size)
> +{
> + struct orc_entry *orc_a, *orc_b;
> + int *a = _a, *b = _b, tmp;
> + int delta = _b - _a;
> +
> + /* Swap the .orc_unwind_ip entries: */
> + tmp = *a;
> + *a = *b + delta;
> + *b = tmp - delta;
> +
> + /* Swap the corresponding .orc_unwind entries: */
> + orc_a = cur_orc_table + (a - cur_orc_ip_table);
> + orc_b = cur_orc_table + (b - cur_orc_ip_table);
> + swap(*orc_a, *orc_b);
> +}
> +
> +static int orc_sort_cmp(const void *_a, const void *_b)
> +{
> + struct orc_entry *orc_a;
> + const int *a = _a, *b = _b;
> + unsigned long a_val = orc_ip(a);
> + unsigned long b_val = orc_ip(b);
> +
> + if (a_val > b_val)
> + return 1;
> + if (a_val < b_val)
> + return -1;
> +
> + /*
> + * The "weak" section terminator entries need to always be first
> + * to ensure the lookup code skips them in favor of real entries.
> + * These terminator entries exist to handle any gaps created by
> + * whitelisted .o files which didn't get objtool generation.
> + */
> + orc_a = cur_orc_table + (a - cur_orc_ip_table);
> + return orc_a->type == ORC_TYPE_UNDEFINED ? -1 : 1;
> +}
> +
> +void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size,
> + void *_orc, size_t orc_size)
> +{
> + int *orc_ip = _orc_ip;
> + struct orc_entry *orc = _orc;
> + unsigned int num_entries = orc_ip_size / sizeof(int);
> +
> + WARN_ON_ONCE(orc_ip_size % sizeof(int) != 0 ||
> + orc_size % sizeof(*orc) != 0 ||
> + num_entries != orc_size / sizeof(*orc));
> +
> + /*
> + * The 'cur_orc_*' globals allow the orc_sort_swap() callback to
> + * associate an .orc_unwind_ip table entry with its corresponding
> + * .orc_unwind entry so they can both be swapped.
> + */
> + mutex_lock(&sort_mutex);
> + cur_orc_ip_table = orc_ip;
> + cur_orc_table = orc;
> + sort(orc_ip, num_entries, sizeof(int), orc_sort_cmp, orc_sort_swap);
> + mutex_unlock(&sort_mutex);
> +
> + mod->arch.orc_unwind_ip = orc_ip;
> + mod->arch.orc_unwind = orc;
> + mod->arch.num_orcs = num_entries;
> +}
> +#endif
> +
> +void __init unwind_init(void)
> +{
> + size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip;
> + size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind;
> + size_t num_entries = orc_ip_size / sizeof(int);
> + struct orc_entry *orc;
> + int i;
> +
> + if (!num_entries || orc_ip_size % sizeof(int) != 0 ||
> + orc_size % sizeof(struct orc_entry) != 0 ||
> + num_entries != orc_size / sizeof(struct orc_entry)) {
> + orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n");
> + return;
> + }
> +
> + /*
> + * Note, the orc_unwind and orc_unwind_ip tables were already
> + * sorted at build time via the 'sorttable' tool.
> + * It's ready for binary search straight away, no need to sort it.
> + */
> +
> + /* Initialize the fast lookup table: */
> + lookup_num_blocks = orc_lookup_end - orc_lookup;
> + for (i = 0; i < lookup_num_blocks-1; i++) {
> + orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind,
> + num_entries,
> + LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i));
> + if (!orc) {
> + orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
> + return;
> + }
> +
> + orc_lookup[i] = orc - __start_orc_unwind;
> + }
> +
> + /* Initialize the ending block: */
> + orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries,
> + LOOKUP_STOP_IP);
> + if (!orc) {
> + orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n");
> + return;
> + }
> + orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind;
> +
> + orc_init = true;
> +}
> +
> +static inline bool on_stack(struct stack_info *info, unsigned long addr, size_t len)
> +{
> + unsigned long begin = info->begin;
> + unsigned long end = info->end;
> +
> + return (info->type != STACK_TYPE_UNKNOWN &&
> + addr >= begin && addr < end &&
> + addr + len > begin && addr + len <= end);
> +}
> +
> +static bool stack_access_ok(struct unwind_state *state, unsigned long addr,
> + size_t len)
> +{
> + struct stack_info *info = &state->stack_info;
> +
> + if (on_stack(info, addr, len))
> + return true;
> +
> + return !get_stack_info(addr, state->task, info) &&
> + on_stack(info, addr, len);
> +}
> +
> +unsigned long unwind_get_return_address(struct unwind_state *state)
> +{
> + return __unwind_get_return_address(state);
> +}
> +EXPORT_SYMBOL_GPL(unwind_get_return_address);
> +
> +void unwind_start(struct unwind_state *state, struct task_struct *task,
> + struct pt_regs *regs)
> +{
> + __unwind_start(state, task, regs);
> + if (!unwind_done(state) && !__kernel_text_address(state->pc))
> + unwind_next_frame(state);
> +}
> +EXPORT_SYMBOL_GPL(unwind_start);
> +
> +static bool is_entry_func(unsigned long addr)
> +{
> + extern u32 kernel_entry;
> + extern u32 kernel_entry_end;
> +
> + return addr >= (unsigned long)&kernel_entry &&
> + addr < (unsigned long)&kernel_entry_end;
> +}
> +
> +static inline unsigned long bt_address(unsigned long ra)
> +{
> + extern unsigned long eentry;
> +
> + if (__kernel_text_address(ra))
> + return ra;
> +
> + /* We are in preempt_disable() here */
> + if (__module_text_address(ra))
> + return ra;
> +
> + if (ra >= eentry && ra < eentry + EXCCODE_INT_END * VECSIZE) {
> + unsigned long type = (ra - eentry) / VECSIZE;
> + unsigned long offset = (ra - eentry) % VECSIZE;
> + unsigned long func;
> +
> + switch (type) {
> + case EXCCODE_TLBL:
> + case EXCCODE_TLBI:
> + func = (unsigned long)handle_tlb_load;
> + break;
> + case EXCCODE_TLBS:
> + func = (unsigned long)handle_tlb_store;
> + break;
> + case EXCCODE_TLBM:
> + func = (unsigned long)handle_tlb_modify;
> + break;
> + case EXCCODE_TLBNR:
> + case EXCCODE_TLBNX:
> + case EXCCODE_TLBPE:
> + func = (unsigned long)handle_tlb_protect;
> + break;
> + case EXCCODE_ADE:
> + func = (unsigned long)handle_ade;
> + break;
> + case EXCCODE_ALE:
> + func = (unsigned long)handle_ale;
> + break;
> + case EXCCODE_BCE:
> + func = (unsigned long)handle_bce;
> + break;
> + case EXCCODE_SYS:
> + func = (unsigned long)handle_sys;
> + break;
> + case EXCCODE_BP:
> + func = (unsigned long)handle_bp;
> + break;
> + case EXCCODE_INE:
> + case EXCCODE_IPE:
> + func = (unsigned long)handle_ri;
> + break;
> + case EXCCODE_FPDIS:
> + func = (unsigned long)handle_fpu;
> + break;
> + case EXCCODE_LSXDIS:
> + func = (unsigned long)handle_lsx;
> + break;
> + case EXCCODE_LASXDIS:
> + func = (unsigned long)handle_lasx;
> + break;
> + case EXCCODE_FPE:
> + func = (unsigned long)handle_fpe;
> + break;
> + case EXCCODE_WATCH:
> + func = (unsigned long)handle_watch;
> + break;
> + case EXCCODE_BTDIS:
> + func = (unsigned long)handle_lbt;
> + break;
> + case EXCCODE_INT_START ... EXCCODE_INT_END - 1:
> + func = (unsigned long)handle_vint;
> + break;
> + default:
> + func = (unsigned long)handle_reserved;
> + break;
> + }
> +
> + return func + offset;
> + }
> +
> + return ra;
> +}
> +
> +bool unwind_next_frame(struct unwind_state *state)
> +{
> + struct stack_info *info = &state->stack_info;
> + struct orc_entry *orc;
> + struct pt_regs *regs;
> + unsigned long *p, pc;
> +
> + if (unwind_done(state))
> + return false;
> +
> + /* Don't let modules unload while we're reading their ORC data. */
> + preempt_disable();
> +
> + if (is_entry_func(state->pc))
> + goto end;
> +
> + orc = orc_find(state->pc);
> + if (!orc) {
> + orc = &orc_fp_entry;
> + state->error = true;
> + }
> +
> + switch (orc->sp_reg) {
> + case ORC_REG_SP:
> + state->sp = state->sp + orc->sp_offset;
> + break;
> + case ORC_REG_BP:
> + state->sp = state->fp;
> + break;
> + default:
> + orc_warn("unknown SP base reg %d at %pB\n",
> + orc->sp_reg, (void *)state->pc);
> + goto err;
> + }
> +
> + switch (orc->bp_reg) {
> + case ORC_REG_PREV_SP:
> + p = (unsigned long *)(state->sp + orc->bp_offset);
> + if (!stack_access_ok(state, (unsigned long)p, sizeof(unsigned long)))
> + goto err;
> +
> + state->fp = *p;
> + break;
> + case ORC_REG_UNDEFINED:
> + /* Nothing. */
> + break;
> + default:
> + orc_warn("unknown FP base reg %d at %pB\n",
> + orc->bp_reg, (void *)state->pc);
> + goto err;
> + }
> +
> + switch (orc->type) {
> + case UNWIND_HINT_TYPE_CALL:
> + if (orc->ra_reg == ORC_REG_PREV_SP) {
> + p = (unsigned long *)(state->sp + orc->ra_offset);
> + if (!stack_access_ok(state, (unsigned long)p, sizeof(unsigned long)))
> + goto err;
> +
> + pc = unwind_graph_addr(state, *p, state->sp);
> + pc -= LOONGARCH_INSN_SIZE;
> + } else if (orc->ra_reg == ORC_REG_UNDEFINED) {
> + if (!state->ra || state->ra == state->pc)
> + goto err;
> +
> + pc = unwind_graph_addr(state, state->ra, state->sp);
> + pc -= LOONGARCH_INSN_SIZE;
> + state->ra = 0;
> + } else {
> + orc_warn("unknown ra base reg %d at %pB\n",
> + orc->ra_reg, (void *)state->pc);
> + goto err;
> + }
> + break;
> + case UNWIND_HINT_TYPE_REGS:
> + if (state->stack_info.type == STACK_TYPE_IRQ && state->sp == info->end)
> + regs = (struct pt_regs *)info->next_sp;
> + else
> + regs = (struct pt_regs *)state->sp;
> +
> + if (!stack_access_ok(state, (unsigned long)regs, sizeof(*regs)))
> + goto err;
> +
> + if ((info->end == (unsigned long)regs + sizeof(*regs)) &&
> + !regs->regs[3] && !regs->regs[1])
> + goto end;
> +
> + if (user_mode(regs))
> + goto end;
> +
> + pc = regs->csr_era;
> + if (!__kernel_text_address(pc))
> + goto err;
> +
> + state->sp = regs->regs[3];
> + state->ra = regs->regs[1];
> + state->fp = regs->regs[22];
> + get_stack_info(state->sp, state->task, info);
> +
> + break;
> + default:
> + orc_warn("unknown .orc_unwind entry type %d at %pB\n",
> + orc->type, (void *)state->pc);
> + goto err;
> + }
> +
> + state->pc = bt_address(pc);
> + if (!state->pc) {
> + pr_err("cannot find unwind pc at %pK\n", (void *)pc);
> + goto err;
> + }
> +
> + if (!__kernel_text_address(state->pc))
> + goto err;
> +
> + preempt_enable();
> + return true;
> +
> +err:
> + state->error = true;
> +
> +end:
> + preempt_enable();
> + state->stack_info.type = STACK_TYPE_UNKNOWN;
> + return false;
> +}
> +EXPORT_SYMBOL_GPL(unwind_next_frame);
> diff --git a/arch/loongarch/kernel/vmlinux.lds.S b/arch/loongarch/kernel/vmlinux.lds.S
> index bb2ec86..09fd4eb 100644
> --- a/arch/loongarch/kernel/vmlinux.lds.S
> +++ b/arch/loongarch/kernel/vmlinux.lds.S
> @@ -2,6 +2,7 @@
> #include <linux/sizes.h>
> #include <asm/asm-offsets.h>
> #include <asm/thread_info.h>
> +#include <asm/orc_lookup.h>
>
> #define PAGE_SIZE _PAGE_SIZE
> #define RO_EXCEPTION_TABLE_ALIGN 4
> @@ -99,6 +100,8 @@ SECTIONS
> _sdata = .;
> RO_DATA(4096)
>
> + ORC_UNWIND_TABLE
> +
> .got : ALIGN(16) { *(.got) }
> .plt : ALIGN(16) { *(.plt) }
> .got.plt : ALIGN(16) { *(.got.plt) }
> diff --git a/arch/loongarch/lib/Makefile b/arch/loongarch/lib/Makefile
> index a77bf160..e3023d9 100644
> --- a/arch/loongarch/lib/Makefile
> +++ b/arch/loongarch/lib/Makefile
> @@ -3,6 +3,8 @@
> # Makefile for LoongArch-specific library files.
> #
>
> +OBJECT_FILES_NON_STANDARD := y
> +
> lib-y += delay.o memset.o memcpy.o memmove.o \
> clear_user.o copy_user.o csum.o dump_tlb.o unaligned.o
>
> diff --git a/arch/loongarch/mm/tlbex.S b/arch/loongarch/mm/tlbex.S
> index ca17dd3..a44387b 100644
> --- a/arch/loongarch/mm/tlbex.S
> +++ b/arch/loongarch/mm/tlbex.S
> @@ -17,7 +17,8 @@
> #define PTRS_PER_PTE_BITS (PAGE_SHIFT - 3)
>
> .macro tlb_do_page_fault, write
> - SYM_FUNC_START(tlb_do_page_fault_\write)
> + SYM_CODE_START(tlb_do_page_fault_\write)
> + UNWIND_HINT_UNDEFINED
> SAVE_ALL
> csrrd a2, LOONGARCH_CSR_BADV
> move a0, sp
> @@ -25,13 +26,14 @@
> li.w a1, \write
> bl do_page_fault
> RESTORE_ALL_AND_RET
> - SYM_FUNC_END(tlb_do_page_fault_\write)
> + SYM_CODE_END(tlb_do_page_fault_\write)
> .endm
>
> tlb_do_page_fault 0
> tlb_do_page_fault 1
>
> -SYM_FUNC_START(handle_tlb_protect)
> +SYM_CODE_START(handle_tlb_protect)
> + UNWIND_HINT_UNDEFINED
> BACKUP_T0T1
> SAVE_ALL
> move a0, sp
> @@ -41,9 +43,10 @@ SYM_FUNC_START(handle_tlb_protect)
> la_abs t0, do_page_fault
> jirl ra, t0, 0
> RESTORE_ALL_AND_RET
> -SYM_FUNC_END(handle_tlb_protect)
> +SYM_CODE_END(handle_tlb_protect)
>
> -SYM_FUNC_START(handle_tlb_load)
> +SYM_CODE_START(handle_tlb_load)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, EXCEPTION_KS0
> csrwr t1, EXCEPTION_KS1
> csrwr ra, EXCEPTION_KS2
> @@ -187,16 +190,18 @@ nopage_tlb_load:
> csrrd ra, EXCEPTION_KS2
> la_abs t0, tlb_do_page_fault_0
> jr t0
> -SYM_FUNC_END(handle_tlb_load)
> +SYM_CODE_END(handle_tlb_load)
>
> -SYM_FUNC_START(handle_tlb_load_ptw)
> +SYM_CODE_START(handle_tlb_load_ptw)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_KS0
> csrwr t1, LOONGARCH_CSR_KS1
> la_abs t0, tlb_do_page_fault_0
> jr t0
> -SYM_FUNC_END(handle_tlb_load_ptw)
> +SYM_CODE_END(handle_tlb_load_ptw)
>
> -SYM_FUNC_START(handle_tlb_store)
> +SYM_CODE_START(handle_tlb_store)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, EXCEPTION_KS0
> csrwr t1, EXCEPTION_KS1
> csrwr ra, EXCEPTION_KS2
> @@ -343,16 +348,18 @@ nopage_tlb_store:
> csrrd ra, EXCEPTION_KS2
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_store)
> +SYM_CODE_END(handle_tlb_store)
>
> -SYM_FUNC_START(handle_tlb_store_ptw)
> +SYM_CODE_START(handle_tlb_store_ptw)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_KS0
> csrwr t1, LOONGARCH_CSR_KS1
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_store_ptw)
> +SYM_CODE_END(handle_tlb_store_ptw)
>
> -SYM_FUNC_START(handle_tlb_modify)
> +SYM_CODE_START(handle_tlb_modify)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, EXCEPTION_KS0
> csrwr t1, EXCEPTION_KS1
> csrwr ra, EXCEPTION_KS2
> @@ -497,16 +504,18 @@ nopage_tlb_modify:
> csrrd ra, EXCEPTION_KS2
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_modify)
> +SYM_CODE_END(handle_tlb_modify)
>
> -SYM_FUNC_START(handle_tlb_modify_ptw)
> +SYM_CODE_START(handle_tlb_modify_ptw)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_KS0
> csrwr t1, LOONGARCH_CSR_KS1
> la_abs t0, tlb_do_page_fault_1
> jr t0
> -SYM_FUNC_END(handle_tlb_modify_ptw)
> +SYM_CODE_END(handle_tlb_modify_ptw)
>
> -SYM_FUNC_START(handle_tlb_refill)
> +SYM_CODE_START(handle_tlb_refill)
> + UNWIND_HINT_UNDEFINED
> csrwr t0, LOONGARCH_CSR_TLBRSAVE
> csrrd t0, LOONGARCH_CSR_PGD
> lddir t0, t0, 3
> @@ -521,4 +530,4 @@ SYM_FUNC_START(handle_tlb_refill)
> tlbfill
> csrrd t0, LOONGARCH_CSR_TLBRSAVE
> ertn
> -SYM_FUNC_END(handle_tlb_refill)
> +SYM_CODE_END(handle_tlb_refill)
> diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
> index 58151d0..bbd1d47 100644
> --- a/arch/loongarch/power/Makefile
> +++ b/arch/loongarch/power/Makefile
> @@ -1,3 +1,5 @@
> +OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
> +
> obj-y += platform.o
>
> obj-$(CONFIG_SUSPEND) += suspend.o suspend_asm.o
> diff --git a/arch/loongarch/vdso/Makefile b/arch/loongarch/vdso/Makefile
> index 5c97d1463..997f41c 100644
> --- a/arch/loongarch/vdso/Makefile
> +++ b/arch/loongarch/vdso/Makefile
> @@ -3,6 +3,7 @@
>
> KASAN_SANITIZE := n
> KCOV_INSTRUMENT := n
> +OBJECT_FILES_NON_STANDARD := y
>
> # Include the generic Makefile to check the built vdso.
> include $(srctree)/lib/vdso/Makefile
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index d7779a1..df29ddb 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -116,6 +116,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> */
> #define __stringify_label(n) #n
>
> +#define __annotate_reachable(c) ({ \
> + asm volatile(__stringify_label(c) ":\n\t" \
> + ".pushsection .discard.reachable\n\t" \
> + ".long " __stringify_label(c) "b - .\n\t" \
> + ".popsection\n\t"); \
> +})
> +#define annotate_reachable() __annotate_reachable(__COUNTER__)
> +
> #define __annotate_unreachable(c) ({ \
> asm volatile(__stringify_label(c) ":\n\t" \
> ".pushsection .discard.unreachable\n\t" \
> @@ -128,6 +136,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> #define __annotate_jump_table __section(".rodata..c_jump_table")
>
> #else /* !CONFIG_OBJTOOL */
> +#define annotate_reachable()
> #define annotate_unreachable()
> #define __annotate_jump_table
> #endif /* CONFIG_OBJTOOL */
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 576cf64..baaed78 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -33,7 +33,10 @@ ifdef CONFIG_UNWINDER_ORC
> ifeq ($(ARCH),x86_64)
> ARCH := x86
> endif
> -HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/x86/include
> +ifeq ($(ARCH),loongarch)
> +ARCH := loongarch
> +endif
> +HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/$(ARCH)/include
> HOSTCFLAGS_sorttable.o += -DUNWINDER_ORC_ENABLED
> endif
>
> --
> 2.1.0
>
>

2023-10-15 12:58:22

by Jinyang He

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

On 2023-10-14 19:37, Huacai Chen wrote:

> +CC Jinyang
>
> On Sat, Oct 14, 2023 at 5:21 PM Tiezhu Yang <[email protected]> wrote:
>>
>>
>> On 10/11/2023 12:37 PM, Huacai Chen wrote:
>>> Hi, Tiezhu,
>>>
>>> Maybe "LoongArch: Add ORC stack unwinder support" is better.
>> OK, will modify it.
>>
>>> On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>>>> The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
>>>> similar in concept to a DWARF unwinder. The difference is that the format
>>>> of the ORC data is much simpler than DWARF, which in turn allows the ORC
>>>> unwinder to be much simpler and faster.
>> ...
>>
>>>> +ifdef CONFIG_OBJTOOL
>>>> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
>>>> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
>>>> +ifeq ($(shell as --help 2>&1 | grep -e '-mthin-add-sub'),)
>>>> + $(error Sorry, you need a newer gas version with -mthin-add-sub option)
>>> I prefer not to error out here, because without this option we can still
>>> build a runnable kernel.
>> I agree with you that it is better to not error out to stop compilation,
>> but there are many objtool warnings during the compile process with old
>> binutils, so it is necessary to give a warning so that the users know
>> what happened and how to fix the lots of objtool warnings.
>>
>> That is to say, I would prefer to replace "error" with "warning".
>>
>>>> +endif
>>>> +KBUILD_AFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
>>>> +KBUILD_CFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
>>>> +KBUILD_CFLAGS += -fno-optimize-sibling-calls -fno-jump-tables -falign-functions=4
>>>> +endif
>> ...
>>
>>>> +#define ORC_REG_BP 3
>>> Use FP instead of BP in this patch, too.
>> OK, will do it.
>>
>>>> +#define ORC_REG_MAX 4
>> ...
>>
>>>> +.macro UNWIND_HINT_UNDEFINED
>>>> + UNWIND_HINT type=UNWIND_HINT_TYPE_UNDEFINED
>>>> +.endm
>>> We don't need to set sp_reg=ORC_REG_UNDEFINED for UNWIND_HINT_UNDEFINED?
>> Yes, there is no need to set sp_reg. The instructions marked with UNDEFINED
>> are blind spots in ORC coverage and are not related to the stack trace;
>> this is similar to x86.
>>
>>>> +
>>>> +.macro UNWIND_HINT_EMPTY
>>>> + UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL
>>>> +.endm
>>> We don't need to define UNWIND_HINT_END_OF_STACK?
>> Yes, it is useless now.
>>
>>>> +
>>>> +.macro UNWIND_HINT_REGS
>>>> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_REGS
>>>> +.endm
>>>> +
>>>> +.macro UNWIND_HINT_FUNC
>>>> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_CALL
>>>> +.endm
>>> We don't need to set sp_offset for UNWIND_HINT_REGS and UNWIND_HINT_FUNC?
>> sp_offset is 0 by default, no need to set it unless you need to change
>> its value, see include/linux/objtool.h
>> .macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 signal=0
>>
>>>> +
>>>> +#endif /* __ASSEMBLY__ */
>> ...
>>
>>>> diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
>>>> index 65518bb..e43115f 100644
>>>> --- a/arch/loongarch/kernel/entry.S
>>>> +++ b/arch/loongarch/kernel/entry.S
>>>> @@ -14,11 +14,13 @@
>>>> #include <asm/regdef.h>
>>>> #include <asm/stackframe.h>
>>>> #include <asm/thread_info.h>
>>>> +#include <asm/unwind_hints.h>
>>>>
>>>> .text
>>>> .cfi_sections .debug_frame
>>>> .align 5
>>>> -SYM_FUNC_START(handle_syscall)
>>>> +SYM_CODE_START(handle_syscall)
>>> Why?
>>>
>> see include/linux/linkage.h
>> FUNC -- C-like functions (proper stack frame etc.)
>> CODE -- non-C code (e.g. irq handlers with different, special stack etc.)
> Hi, Jinyang,
>
> What do you think about it? In our internal repo, most asm functions
> changed in this patch are still marked with FUNC, not CODE.

Hi, Huacai,


As the annotations in include/linux/linkage.h describe, CODE should be used
for exception handlers, where the stack at the start of the handler is not
balanced with the stack at the exit. For CODE, validate_branch,
validate_return, and validate_sibling_call do not check the stack.
CODE needs a HINT to describe the actual stack state at its beginning.
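
To illustrate the stack check mentioned above, here is a minimal C sketch.
It is not objtool code: toy_insn and toy_check_balanced are names made up
for this example, and the only thing modeled is the idea that a FUNC must
reach every return with the SP offset it had at entry, while CODE is not
held to that rule and relies on a HINT instead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction model: only the SP adjustment matters here. */
enum toy_kind { TOY_K_ADJUST_SP, TOY_K_OTHER, TOY_K_RETURN };

struct toy_insn {
	enum toy_kind kind;
	long sp_delta;		/* used only for TOY_K_ADJUST_SP */
};

/*
 * Walk a straight-line instruction sequence and report whether every
 * return is reached with the SP offset back at its entry value.
 * When is_func is false (the CODE case) the balance check is skipped.
 */
static bool toy_check_balanced(const struct toy_insn *insns, size_t n,
			       bool is_func)
{
	long sp_offset = 0;

	for (size_t i = 0; i < n; i++) {
		switch (insns[i].kind) {
		case TOY_K_ADJUST_SP:
			sp_offset += insns[i].sp_delta;
			break;
		case TOY_K_RETURN:
			/* A FUNC returning with a modified frame is rejected. */
			if (is_func && sp_offset != 0)
				return false;
			break;
		case TOY_K_OTHER:
			break;
		}
	}
	return true;
}
```

In this model, a sequence that allocates 16 bytes and returns without
freeing them fails when treated as FUNC and passes when treated as CODE.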

In objtool's check flow, the entry point for checking FUNC is
validate_functions and the entry point for checking CODE is
validate_unwind_hints. In both cases the function that actually does the
checking is validate_branch. If the stack check is ignored, they produce
the same ORC info in most cases. In the internal repo, limited by what I
knew about objtool at that time, I might have done something wrong, e.g.
NOT_SIBLING_CALL_HINT could be a way to ignore stack checks. The exception
handler code in upstream is cleaner than that in the internal repo, so I
hope this can be fixed in upstream first.

handle_syscall is an example of a FUNC that looks stack balanced. However,
the RA register at the entry is not the real RA, and its SP is also changed
from the user stack SP to the kernel stack SP. So in fact, it is not stack
balanced. It needs to be marked as CODE, with a HINT at the CODE entry to
describe the actual stack (usually as undefined).

In short, objtool strictly depends on canonical code so that it can
get the ORC information right.
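
As a rough illustration of why the state at the entry matters, the sketch
below models one unwind step in plain C. It loosely follows the CALL case of
unwind_next_frame() in this patch (the previous SP is the current SP plus
sp_offset, and the return address is loaded from the previous SP plus
ra_offset), but it leaves out the register-RA case and the PC adjustment.
The struct and function names are hypothetical, and the stack is just a
byte range supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ORC entry. */
struct toy_orc {
	int sp_offset;	/* previous SP = current SP + sp_offset (bytes) */
	int ra_offset;	/* RA is saved at previous SP + ra_offset (bytes) */
};

struct toy_state {
	uintptr_t sp;
	uintptr_t pc;
};

/*
 * Apply one ORC entry to the unwind state. The stack is modeled as the
 * byte range [lo, hi); an access outside that range fails the step.
 */
static bool toy_unwind_step(struct toy_state *st, const struct toy_orc *orc,
			    uintptr_t lo, uintptr_t hi)
{
	uintptr_t prev_sp = st->sp + orc->sp_offset;
	uintptr_t slot = prev_sp + orc->ra_offset;

	if (slot < lo || slot + sizeof(uintptr_t) > hi)
		return false;

	st->pc = *(const uintptr_t *)slot;
	st->sp = prev_sp;
	return true;
}
```

If the entry describes a frame size that does not match what the code
really did, the step either reads the wrong slot or leaves the stack range,
which is why code like handle_syscall needs an explicit HINT.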

Thanks,

Jinyang


>
>>>> + UNWIND_HINT_UNDEFINED
>>>> csrrd t0, PERCPU_BASE_KS
>> ...
>>
>>>> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
>>>> index 53b883d..5664390 100644
>>>> --- a/arch/loongarch/kernel/head.S
>>>> +++ b/arch/loongarch/kernel/head.S
>>>> @@ -43,6 +43,7 @@ SYM_DATA(kernel_offset, .long _kernel_offset);
>>>> .align 12
>>>>
>>>> SYM_CODE_START(kernel_entry) # kernel entry point
>>>> + UNWIND_HINT_EMPTY
>>> I'm not sure but I think this isn't needed, because
>>> "OBJECT_FILES_NON_STANDARD_head.o :=y"
>> Yes, you are right, will remove it.
>>
>>>> /* Config direct window and set PG */
>> ...
>>
>>>> void __init setup_arch(char **cmdline_p)
>>>> {
>>>> + unwind_init();
>>> I think this line should be after cpu_probe().
>> I am OK to do this change, but if so, there are no stack trace before
>> cpu_probe() for the early code.
> As I said before, stack trace needs printk, but printk cannot work
> before cpu_probe().
>
>>>> cpu_probe();
>>>>
>>>> init_environ();
>> ...
>>
>>>> diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
>>>> index 58151d0..bbd1d47 100644
>>>> --- a/arch/loongarch/power/Makefile
>>>> +++ b/arch/loongarch/power/Makefile
>>>> @@ -1,3 +1,5 @@
>>>> +OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
>>> hibernate_asm.o has no problem?
>> Yes, only suspend_asm.o has one warning, just ignore it.
> What kind of warning? When I submitted the suspend patch, Jinyang told
> me that with his changes loongarch_suspend_enter() can be a regular
> function.
>
> Huacai

Hi, Tiezhu,

We can treat a jirl with a link register as a call instruction.
loongarch_suspend_enter:
    jirl a0, t0, 0 /* Call BIOS's STR sleep routine */
Its link register is a0 (not ra), but we still treat it as a call
instruction. The function is also stack balanced, so it can be a
regular function.
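
A minimal sketch of that rule in C, assuming the standard jirl encoding
(opcode in bits 31..26, offs16 in bits 25..10, rj in bits 9..5, rd in bits
4..0). The enum and function names are made up for this example; this is
not the decoder from the patch, only the classification idea that any
non-zero link register makes jirl a call.

```c
#include <stdint.h>

#define TOY_JIRL_OPCODE	0x13u	/* bits 31..26 of a jirl instruction word */
#define TOY_REG_ZERO	0u
#define TOY_REG_RA	1u

enum toy_insn_type {
	TOY_NOT_JIRL,
	TOY_INSN_RETURN,
	TOY_INSN_JUMP_DYNAMIC,
	TOY_INSN_CALL_DYNAMIC,
};

/* Classify "jirl rd, rj, offs16" by its link register rd. */
static enum toy_insn_type toy_classify_jirl(uint32_t word)
{
	uint32_t rd = word & 0x1f;
	uint32_t rj = (word >> 5) & 0x1f;
	uint32_t offs16 = (word >> 10) & 0xffff;

	if ((word >> 26) != TOY_JIRL_OPCODE)
		return TOY_NOT_JIRL;

	/* Any link register (ra, a0, ...) means the return address is kept. */
	if (rd != TOY_REG_ZERO)
		return TOY_INSN_CALL_DYNAMIC;

	/* "jr ra" is the usual function return. */
	if (rj == TOY_REG_RA && offs16 == 0)
		return TOY_INSN_RETURN;

	/* "jr rj" with no link register is an indirect jump. */
	return TOY_INSN_JUMP_DYNAMIC;
}
```

With this rule, "jirl a0, t0, 0" (0x4c000184) is classified the same way as
"jirl ra, t0, 0" (0x4c000181), while "jr ra" (0x4c000020) is a return.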

Thanks,

Jinyang


>> Thanks,
>> Tiezhu
>>
>>

2023-10-15 13:58:42

by Huacai Chen

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support

Hi, Jinyang,

On Sun, Oct 15, 2023 at 8:58 PM Jinyang He <[email protected]> wrote:
>
> On 2023-10-14 19:37, Huacai Chen wrote:
>
> > +CC Jinyang
> >
> > On Sat, Oct 14, 2023 at 5:21 PM Tiezhu Yang <[email protected]> wrote:
> >>
> >>
> >> On 10/11/2023 12:37 PM, Huacai Chen wrote:
> >>> Hi, Tiezhu,
> >>>
> >>> Maybe "LoongArch: Add ORC stack unwinder support" is better.
> >> OK, will modify it.
> >>
> >>> On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
> >>>> The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
> >>>> similar in concept to a DWARF unwinder. The difference is that the format
> >>>> of the ORC data is much simpler than DWARF, which in turn allows the ORC
> >>>> unwinder to be much simpler and faster.
> >> ...
> >>
> >>>> +ifdef CONFIG_OBJTOOL
> >>>> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ecb802d02eeb
> >>>> +# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=816029e06768
> >>>> +ifeq ($(shell as --help 2>&1 | grep -e '-mthin-add-sub'),)
> >>>> + $(error Sorry, you need a newer gas version with -mthin-add-sub option)
> >>> I prefer not to error out here, because without this option we can still
> >>> build a runnable kernel.
> >> I agree with you that it is better to not error out to stop compilation,
> >> but there are many objtool warnings during the compile process with old
> >> binutils, so it is necessary to give a warning so that the users know
> >> what happened and how to fix the lots of objtool warnings.
> >>
> >> That is to say, I would prefer to replace "error" with "warning".
> >>
> >>>> +endif
> >>>> +KBUILD_AFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> >>>> +KBUILD_CFLAGS += $(call cc-option,-mthin-add-sub) $(call cc-option,-Wa$(comma)-mthin-add-sub)
> >>>> +KBUILD_CFLAGS += -fno-optimize-sibling-calls -fno-jump-tables -falign-functions=4
> >>>> +endif
> >> ...
> >>
> >>>> +#define ORC_REG_BP 3
> >>> Use FP instead of BP in this patch, too.
> >> OK, will do it.
> >>
> >>>> +#define ORC_REG_MAX 4
> >> ...
> >>
> >>>> +.macro UNWIND_HINT_UNDEFINED
> >>>> + UNWIND_HINT type=UNWIND_HINT_TYPE_UNDEFINED
> >>>> +.endm
> >>> We don't need to set sp_reg=ORC_REG_UNDEFINED for UNWIND_HINT_UNDEFINED?
> >> Yes, there is no need to set sp_reg. The instructions marked with UNDEFINED
> >> are blind spots in ORC coverage and are not related to the stack trace;
> >> this is similar to x86.
> >>
> >>>> +
> >>>> +.macro UNWIND_HINT_EMPTY
> >>>> + UNWIND_HINT sp_reg=ORC_REG_UNDEFINED type=UNWIND_HINT_TYPE_CALL
> >>>> +.endm
> >>> We don't need to define UNWIND_HINT_END_OF_STACK?
> >> Yes, it is useless now.
> >>
> >>>> +
> >>>> +.macro UNWIND_HINT_REGS
> >>>> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_REGS
> >>>> +.endm
> >>>> +
> >>>> +.macro UNWIND_HINT_FUNC
> >>>> + UNWIND_HINT sp_reg=ORC_REG_SP type=UNWIND_HINT_TYPE_CALL
> >>>> +.endm
> >>> We don't need to set sp_offset for UNWIND_HINT_REGS and UNWIND_HINT_FUNC?
> >> sp_offset is 0 by default, no need to set it unless you need to change
> >> its value, see include/linux/objtool.h
> >> .macro UNWIND_HINT type:req sp_reg=0 sp_offset=0 signal=0
> >>
> >>>> +
> >>>> +#endif /* __ASSEMBLY__ */
> >> ...
> >>
> >>>> diff --git a/arch/loongarch/kernel/entry.S b/arch/loongarch/kernel/entry.S
> >>>> index 65518bb..e43115f 100644
> >>>> --- a/arch/loongarch/kernel/entry.S
> >>>> +++ b/arch/loongarch/kernel/entry.S
> >>>> @@ -14,11 +14,13 @@
> >>>> #include <asm/regdef.h>
> >>>> #include <asm/stackframe.h>
> >>>> #include <asm/thread_info.h>
> >>>> +#include <asm/unwind_hints.h>
> >>>>
> >>>> .text
> >>>> .cfi_sections .debug_frame
> >>>> .align 5
> >>>> -SYM_FUNC_START(handle_syscall)
> >>>> +SYM_CODE_START(handle_syscall)
> >>> Why?
> >>>
> >> see include/linux/linkage.h
> >> FUNC -- C-like functions (proper stack frame etc.)
> >> CODE -- non-C code (e.g. irq handlers with different, special stack etc.)
> > Hi, Jinyang,
> >
> > What do you think about it? In our internal repo, most asm functions
> > changed in this patch are still marked with FUNC, not CODE.
>
> Hi, Huacai,
>
>
> As the annotations in include/linux/linkage.h describe, CODE should be used
> for exception handlers, where the stack at the start of the handler is not
> balanced with the stack at the exit. For CODE, validate_branch,
> validate_return, and validate_sibling_call do not check the stack.
> CODE needs a HINT to describe the actual stack state at its beginning.
>
> In objtool's check flow, the entry point for checking FUNC is
> validate_functions and the entry point for checking CODE is
> validate_unwind_hints. In both cases the function that actually does the
> checking is validate_branch. If the stack check is ignored, they produce
> the same ORC info in most cases. In the internal repo, limited by what I
> knew about objtool at that time, I might have done something wrong, e.g.
> NOT_SIBLING_CALL_HINT could be a way to ignore stack checks. The exception
> handler code in upstream is cleaner than that in the internal repo, so I
> hope this can be fixed in upstream first.
>
> handle_syscall is an example of a FUNC that looks stack balanced. However,
> the RA register at the entry is not the real RA, and its SP is also changed
> from the user stack SP to the kernel stack SP. So in fact, it is not stack
> balanced. It needs to be marked as CODE, with a HINT at the CODE entry to
> describe the actual stack (usually as undefined).
>
> In short, objtool strictly depends on canonical code so that it can
> get the ORC information right.
Is the code in tlbex.S in the same situation as handle_syscall()? If so, I
suggest submitting a separate patch to rename FUNC to CODE. That will be
easy to review, and it can go upstream earlier because it is independent of
objtool.

Huacai

>
> Thanks,
>
> Jinyang
>
>
> >
> >>>> + UNWIND_HINT_UNDEFINED
> >>>> csrrd t0, PERCPU_BASE_KS
> >> ...
> >>
> >>>> diff --git a/arch/loongarch/kernel/head.S b/arch/loongarch/kernel/head.S
> >>>> index 53b883d..5664390 100644
> >>>> --- a/arch/loongarch/kernel/head.S
> >>>> +++ b/arch/loongarch/kernel/head.S
> >>>> @@ -43,6 +43,7 @@ SYM_DATA(kernel_offset, .long _kernel_offset);
> >>>> .align 12
> >>>>
> >>>> SYM_CODE_START(kernel_entry) # kernel entry point
> >>>> + UNWIND_HINT_EMPTY
> >>> I'm not sure but I think this isn't needed, because
> >>> "OBJECT_FILES_NON_STANDARD_head.o :=y"
> >> Yes, you are right, will remove it.
> >>
> >>>> /* Config direct window and set PG */
> >> ...
> >>
> >>>> void __init setup_arch(char **cmdline_p)
> >>>> {
> >>>> + unwind_init();
> >>> I think this line should be after cpu_probe().
> >> I am OK with this change, but then there is no stack trace before
> >> cpu_probe() for the early code.
> > As I said before, stack trace needs printk, but printk cannot work
> > before cpu_probe().
> >
> >>>> cpu_probe();
> >>>>
> >>>> init_environ();
> >> ...
> >>
> >>>> diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
> >>>> index 58151d0..bbd1d47 100644
> >>>> --- a/arch/loongarch/power/Makefile
> >>>> +++ b/arch/loongarch/power/Makefile
> >>>> @@ -1,3 +1,5 @@
> >>>> +OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
> >>> hibernate_asm.o has no problem?
> >> Yes, only suspend_asm.o has one warning, just ignore it.
> > What kind of warning? When I submitted the suspend patch, Jinyang told
> > me that with his changes loongarch_suspend_enter() can be a regular
> > function.
> >
> > Huacai
>
> Hi, Tiezhu,
>
> We can treat a jirl with a link register as a call instruction.
> loongarch_suspend_enter:
> jirl a0, t0, 0 /* Call BIOS's STR sleep routine */
> Its link register is a0 (not ra), but we still treat it as a call
> instruction. The func is also stack balanced, so it can be a
> regular function.
>
> Thanks,
>
> Jinyang
>
>
> >> Thanks,
> >> Tiezhu
> >>
> >>
>
>

2023-10-17 12:33:43

by Tiezhu Yang

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support



On 10/14/2023 07:40 PM, Huacai Chen wrote:
> On Mon, Oct 9, 2023 at 9:03 PM Tiezhu Yang <[email protected]> wrote:
>>
>> The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
>> similar in concept to a DWARF unwinder. The difference is that the format
>> of the ORC data is much simpler than DWARF, which in turn allows the ORC
>> unwinder to be much simpler and faster.

...

>> diff --git a/arch/loongarch/configs/loongson3_defconfig b/arch/loongarch/configs/loongson3_defconfig
>> index a3b52aa..de911c3 100644
>> --- a/arch/loongarch/configs/loongson3_defconfig
>> +++ b/arch/loongarch/configs/loongson3_defconfig
>> @@ -5,6 +5,7 @@ CONFIG_NO_HZ=y
>> CONFIG_HIGH_RES_TIMERS=y
>> CONFIG_BPF_SYSCALL=y
>> CONFIG_BPF_JIT=y
>> +CONFIG_BPF_JIT_ALWAYS_ON=y
> Does BPF have something to do with ORC?

This is to avoid the following warning:

CC kernel/bpf/core.o
{standard input}: Assembler messages:
{standard input}:10805: Warning: setting incorrect section attributes for .rodata..c_jump_table
kernel/bpf/core.o: warning: objtool: ___bpf_prog_run+0x44: sibling call from callable instruction with modified stack frame

This happens because -fno-jump-tables is currently specified in
arch/loongarch/Makefile, while __annotate_jump_table is still used
in ___bpf_prog_run():

#ifndef CONFIG_BPF_JIT_ALWAYS_ON
...
static u64 ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn)
{
...
static const void * const jumptable[256] __annotate_jump_table = {
...
}
#endif
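
For readers unfamiliar with this kind of table, here is a minimal sketch
of the computed-goto dispatch pattern that such a jumptable implements.
It is a toy interpreter, not the kernel's ___bpf_prog_run(); the names
toy_run, OP_INC, OP_DBL and OP_HALT are made up for illustration, and the
code relies on the GNU C labels-as-values extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes for this sketch only. */
enum { OP_INC, OP_DBL, OP_HALT, OP_COUNT };

/*
 * Toy interpreter using a table of label addresses. The caller must
 * pass only opcodes below OP_COUNT and terminate the program with
 * OP_HALT.
 */
static int64_t toy_run(const uint8_t *insn)
{
	/* One code address per opcode (GNU C labels-as-values). */
	static const void * const jumptable[OP_COUNT] = {
		[OP_INC]  = &&do_inc,
		[OP_DBL]  = &&do_dbl,
		[OP_HALT] = &&do_halt,
	};
	int64_t acc = 0;

	assert(insn);

/* Fetch the next opcode and jump indirectly to its handler. */
#define DISPATCH() goto *jumptable[*insn++]
	DISPATCH();
do_inc:
	acc += 1;
	DISPATCH();
do_dbl:
	acc *= 2;
	DISPATCH();
do_halt:
	return acc;
#undef DISPATCH
}
```

Each DISPATCH() is an indirect jump through the table, so a tool that
validates control flow has to know where the table lives in order to
follow those jumps.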

I think we can remove CONFIG_BPF_JIT_ALWAYS_ON from the defconfig now
because the warning is harmless.

Thanks,
Tiezhu

2023-10-17 12:37:25

by Tiezhu Yang

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support



On 10/14/2023 07:37 PM, Huacai Chen wrote:
> +CC Jinyang
>
> On Sat, Oct 14, 2023 at 5:21 PM Tiezhu Yang <[email protected]> wrote:

...

>>>> diff --git a/arch/loongarch/power/Makefile b/arch/loongarch/power/Makefile
>>>> index 58151d0..bbd1d47 100644
>>>> --- a/arch/loongarch/power/Makefile
>>>> +++ b/arch/loongarch/power/Makefile
>>>> @@ -1,3 +1,5 @@
>>>> +OBJECT_FILES_NON_STANDARD_suspend_asm.o := y
>>> hibernate_asm.o has no problem?
>>
>> Yes, only suspend_asm.o has one warning, just ignore it.
> What kind of warning? When I submitted the suspend patch, Jinyang told
> me that with his changes loongarch_suspend_enter() can be a regular
> function.

Like this:

AS arch/loongarch/power/suspend_asm.o
arch/loongarch/power/suspend_asm.o: warning: objtool: loongarch_suspend_enter+0x6c: unreachable instruction

[fedora@linux 6.6.test]$ objdump -M no-aliases -D arch/loongarch/power/suspend_asm.o
0000000000000ffc <loongarch_suspend_enter>:
ffc: 02fb0063 addi.d $sp, $sp, -320
1000: 29c02061 st.d $ra, $sp, 8
1004: 29c04062 st.d $tp, $sp, 16
1008: 29c06063 st.d $sp, $sp, 24
100c: 29c08064 st.d $a0, $sp, 32
1010: 29c2a075 st.d $r21, $sp, 168
1014: 29c2c076 st.d $fp, $sp, 176
1018: 29c2e077 st.d $s0, $sp, 184
101c: 29c30078 st.d $s1, $sp, 192
1020: 29c32079 st.d $s2, $sp, 200
1024: 29c3407a st.d $s3, $sp, 208
1028: 29c3607b st.d $s4, $sp, 216
102c: 29c3807c st.d $s5, $sp, 224
1030: 29c3a07d st.d $s6, $sp, 232
1034: 29c3c07e st.d $s7, $sp, 240
1038: 29c3e07f st.d $s8, $sp, 248
103c: 1a00000c pcalau12i $t0, 0
1040: 02c0018c addi.d $t0, $t0, 0
1044: 29c00183 st.d $sp, $t0, 0
1048: 54000000 bl 0 # 1048 <loongarch_suspend_enter+0x4c>
104c: 02c00065 addi.d $a1, $sp, 0
1050: 1a000004 pcalau12i $a0, 0
1054: 02c00084 addi.d $a0, $a0, 0
1058: 1a00000c pcalau12i $t0, 0
105c: 02c0018c addi.d $t0, $t0, 0
1060: 28c0018c ld.d $t0, $t0, 0
1064: 4c000184 jirl $a0, $t0, 0

0000000000001068 <loongarch_wakeup_start>:
1068: 0380040c ori $t0, $zero, 0x1
106c: 0320018c lu52i.d $t0, $t0, -2048
1070: 0406002c csrwr $t0, 0x180
1074: 0380440c ori $t0, $zero, 0x11
1078: 0324018c lu52i.d $t0, $t0, -1792
107c: 0406042c csrwr $t0, 0x181
1080: 0324000c lu52i.d $t0, $zero, -1792
1084: 1800000d pcaddi $t1, 0
1088: 0015358c or $t0, $t0, $t1
108c: 4c000d80 jirl $zero, $t0, 12
1090: 0382c00c ori $t0, $zero, 0xb0
1094: 0400002c csrwr $t0, 0x0
1098: 1a00000c pcalau12i $t0, 0
109c: 02c0018c addi.d $t0, $t0, 0
10a0: 28c00183 ld.d $sp, $t0, 0
10a4: 28c02061 ld.d $ra, $sp, 8
10a8: 28c04062 ld.d $tp, $sp, 16
10ac: 28c06063 ld.d $sp, $sp, 24
10b0: 28c08064 ld.d $a0, $sp, 32
10b4: 28c2a075 ld.d $r21, $sp, 168
10b8: 28c2c076 ld.d $fp, $sp, 176
10bc: 28c2e077 ld.d $s0, $sp, 184
10c0: 28c30078 ld.d $s1, $sp, 192
10c4: 28c32079 ld.d $s2, $sp, 200
10c8: 28c3407a ld.d $s3, $sp, 208
10cc: 28c3607b ld.d $s4, $sp, 216
10d0: 28c3807c ld.d $s5, $sp, 224
10d4: 28c3a07d ld.d $s6, $sp, 232
10d8: 28c3c07e ld.d $s7, $sp, 240
10dc: 28c3e07f ld.d $s8, $sp, 248
10e0: 02c50063 addi.d $sp, $sp, 320
10e4: 4c000020 jirl $zero, $ra, 0

The jirl decoder needs to be modified to handle the following instruction:
1064: 4c000184 jirl $a0, $t0, 0
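
Note that loongarch_suspend_enter+0x6c is 0xffc + 0x6c = 0x1068, the
instruction right after this jirl, which would be unreachable if the jirl
were treated as a plain jump rather than a call. As a rough sketch of the
idea (not the actual objtool decoder), a jirl can be classified by its rd
field: any non-zero link register means a call. The names jirl_kind,
jirl_fields and classify_jirl are hypothetical; the field layout
(opcode in bits [31:26], offs16 in [25:10], rj in [9:5], rd in [4:0]) is
consistent with the instruction words in the dump above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification for this sketch; objtool's types differ. */
enum jirl_kind { JIRL_NOT_JIRL, JIRL_RETURN, JIRL_JUMP, JIRL_CALL };

struct jirl_fields {
	unsigned int rd;	/* link register, bits [4:0] */
	unsigned int rj;	/* base register, bits [9:5] */
	int32_t offset;		/* sign-extended offs16, scaled by 4 */
};

#define REG_ZERO	0	/* $zero is r0 */
#define REG_RA		1	/* $ra is r1 */
#define JIRL_OPCODE	0x13	/* bits [31:26] of a jirl word */

static enum jirl_kind classify_jirl(uint32_t insn, struct jirl_fields *f)
{
	int32_t offs16;

	assert(f);

	if ((insn >> 26) != JIRL_OPCODE)
		return JIRL_NOT_JIRL;

	f->rd = insn & 0x1f;
	f->rj = (insn >> 5) & 0x1f;

	/* Sign-extend the 16-bit immediate, then scale to bytes. */
	offs16 = (int32_t)((insn >> 10) & 0xffff);
	if (offs16 & 0x8000)
		offs16 -= 0x10000;
	f->offset = offs16 * 4;

	/* Any link register other than $zero makes this a call. */
	if (f->rd != REG_ZERO)
		return JIRL_CALL;

	/* "jirl $zero, $ra, 0" is the usual function return. */
	if (f->rj == REG_RA && f->offset == 0)
		return JIRL_RETURN;

	return JIRL_JUMP;
}
```

With this rule, 4c000184 (jirl $a0, $t0, 0) is a call, 4c000020
(jirl $zero, $ra, 0) is a return, and 4c000d80 (jirl $zero, $t0, 12) is
an indirect jump.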

Thanks,
Tiezhu

2023-10-17 12:39:48

by Tiezhu Yang

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support



On 10/15/2023 08:57 PM, Jinyang He wrote:
> On 2023-10-14 19:37, Huacai Chen wrote:

...

>
> Hi, Tiezhu,
>
> We can treat a jirl with a link register as a call instruction.
> loongarch_suspend_enter:
> jirl a0, t0, 0 /* Call BIOS's STR sleep routine */
> Its link register is a0 (not ra), but we still treat it as a call
> instruction. The func is also stack balanced, so it can be a
> regular function.

Yes, thank you, I will modify the jirl decoder to handle this special case
in suspend_asm.o.

Thanks,
Tiezhu

2023-10-17 12:45:06

by Tiezhu Yang

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support



On 10/15/2023 09:57 PM, Huacai Chen wrote:
> Hi, Jinyang,

...

>> In short, objtool is strictly dependent on canonical codes so that it can
>> get the ORC information right.
> Is the code in tlbex.S the same as handle_syscall()? If so, I suggest
> submitting a separate patch to rename FUNC to CODE. That will be easy
> to review, and it can go upstream earlier because it is independent of
> objtool.

Good suggestion, thank you, I have submitted a single patch to do this.

https://lore.kernel.org/loongarch/[email protected]/

Thanks,
Tiezhu

2023-10-18 11:45:51

by Tiezhu Yang

Subject: Re: [PATCH v2 8/8] LoongArch: Add ORC unwinder support



On 10/10/2023 09:46 PM, kernel test robot wrote:
> Hi Tiezhu,
>
> kernel test robot noticed the following build errors:

...

> In file included from scripts/sorttable.c:201:
>>> scripts/sorttable.h:96:10: fatal error: 'asm/orc_types.h' file not found
> #include <asm/orc_types.h>
> ^~~~~~~~~~~~~~~~~
> 1 error generated.

Thanks for your report, I can reproduce the error on x86:

make ARCH=x86_64 olddefconfig
make ARCH=x86_64 prepare

The build error is related to "ARCH=x86_64"; there is no error
without "ARCH=x86_64" when building locally on x86.

As described in Documentation/kbuild/makefiles.rst:

For example, you can pass in ARCH=i386, ARCH=x86_64, or ARCH=x86.
For all of them, SRCARCH=x86 because arch/x86/ supports both i386 and
x86_64.

So we should use SRCARCH instead of ARCH to specify the directory,
like this:

diff --git a/scripts/Makefile b/scripts/Makefile
index 576cf64..e4cca53 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -31,9 +31,12 @@ HOSTLDLIBS_sign-file = $(shell $(HOSTPKG_CONFIG) --libs libcrypto 2> /dev/null |

ifdef CONFIG_UNWINDER_ORC
ifeq ($(ARCH),x86_64)
-ARCH := x86
+SRCARCH := x86
endif
-HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/x86/include
+ifeq ($(ARCH),loongarch)
+SRCARCH := loongarch
+endif
+HOSTCFLAGS_sorttable.o += -I$(srctree)/tools/arch/$(SRCARCH)/include
HOSTCFLAGS_sorttable.o += -DUNWINDER_ORC_ENABLED
endif

I will update the related code of this patch in the next version.

Thanks,
Tiezhu