2021-01-20 19:00:41

by Julien Thierry

[permalink] [raw]
Subject: [RFC PATCH 00/17] objtool: add base support for arm64

Hi,

This series enables objtool to start doing stack validation on arm64
kernel builds. It relies on the previous series I sent, refactoring
the arm64 decoder [1].

First, the aarch64 instruction decoder needed to be made available
to code under tools/. This is done in a similar manner to x86
instruction decoded. One limitation I encountered there is that most
of aarch64 instruction decoder is __kprobe annotated. To bypass that
it remove the kprobe include and had to add an empty __kprobe
definition, but I'd welcome a proper solution to that.

Then instruction semantics are progressively added so objtool can track
the stack state through the execution flow.
There are a few things that needed consideration:
- Generation of constants within executable sections, these either
caused objtool to fail decoding or to wrongly decode constants
as jumps or other instructions affecting execution flow and
causing confusion. To solve this, tracking locations referenced
by instructions using literals was needed.
- Jump tables from switch statements in aarch64 don't have enough
information to link branches with the branch instruction leading to
them. For this, we use a gcc plugin to add some information to establish
those missing links in a format that is already supported by objtool

With this, there are still some errors when building with objtool. A
number of cleanups/annotations are needed on the arm64, as well as
handling SYM_DATA objects in objtool.

Those changes can be found on top of this branch here:
git clone https://github.com/julien-thierry/linux.git -b objtoolxarm64-latest

But it would be nice to have some feedback on this before I start submitting everyting.

[1] https://lkml.org/lkml/2021/1/20/791

Thanks,

Julien

-->

Julien Thierry (15):
tools: Add some generic functions and headers
tools: arm64: Make aarch64 instruction decoder available to tools
tools: bug: Remove duplicate definition
objtool: arm64: Add base definition for arm64 backend
objtool: arm64: Decode add/sub instructions
objtool: arm64: Decode jump and call related instructions
objtool: arm64: Decode other system instructions
objtool: arm64: Decode load/store instructions
objtool: arm64: Decode LDR instructions
objtool: arm64: Accept padding in code sections
efi: libstub: Ignore relocations for .discard sections
objtool: arm64: Implement functions to add switch tables alternatives
objtool: arm64: Cache section with switch table information
objtool: arm64: Handle supported relocations in alternatives
objtool: arm64: Ignore replacement section for alternative callback

Raphael Gault (2):
gcc-plugins: objtool: Add plugin to detect switch table on arm64
objtool: arm64: Enable stack validation for arm64

arch/arm64/Kconfig | 2 +
drivers/firmware/efi/libstub/Makefile | 2 +-
scripts/Makefile.gcc-plugins | 2 +
scripts/gcc-plugins/Kconfig | 4 +
.../arm64_switch_table_detection_plugin.c | 85 +
tools/arch/arm64/include/asm/aarch64-insn.h | 551 +++++++
tools/arch/arm64/lib/aarch64-insn.c | 1425 +++++++++++++++++
tools/include/asm-generic/bitops/__ffs.h | 11 +
tools/include/linux/bug.h | 6 +-
tools/include/linux/kernel.h | 21 +
tools/include/linux/printk.h | 40 +
tools/objtool/Makefile | 5 +
tools/objtool/arch/arm64/Build | 8 +
tools/objtool/arch/arm64/decode.c | 471 ++++++
.../arch/arm64/include/arch/cfi_regs.h | 14 +
tools/objtool/arch/arm64/include/arch/elf.h | 6 +
.../arch/arm64/include/arch/endianness.h | 9 +
.../objtool/arch/arm64/include/arch/special.h | 23 +
tools/objtool/arch/arm64/special.c | 134 ++
tools/objtool/arch/x86/decode.c | 5 +
tools/objtool/check.c | 6 +
tools/objtool/include/objtool/arch.h | 3 +
tools/objtool/sync-check.sh | 5 +
23 files changed, 2832 insertions(+), 6 deletions(-)
create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
create mode 100644 tools/arch/arm64/include/asm/aarch64-insn.h
create mode 100644 tools/arch/arm64/lib/aarch64-insn.c
create mode 100644 tools/include/linux/printk.h
create mode 100644 tools/objtool/arch/arm64/Build
create mode 100644 tools/objtool/arch/arm64/decode.c
create mode 100644 tools/objtool/arch/arm64/include/arch/cfi_regs.h
create mode 100644 tools/objtool/arch/arm64/include/arch/elf.h
create mode 100644 tools/objtool/arch/arm64/include/arch/endianness.h
create mode 100644 tools/objtool/arch/arm64/include/arch/special.h
create mode 100644 tools/objtool/arch/arm64/special.c

--
2.25.4


2021-01-20 19:13:17

by Julien Thierry

[permalink] [raw]
Subject: [RFC PATCH 06/17] objtool: arm64: Decode jump and call related instructions

Decode branch, branch and link (aarch64's call) and return instructions.

Signed-off-by: Julien Thierry <[email protected]>
---
tools/objtool/arch/arm64/decode.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c
index 0f312dd1b146..924121b4b466 100644
--- a/tools/objtool/arch/arm64/decode.c
+++ b/tools/objtool/arch/arm64/decode.c
@@ -205,6 +205,28 @@ int arch_decode_instruction(const struct elf *elf, const struct section *sec,
}
*type = INSN_OTHER;
break;
+ case AARCH64_INSN_CLS_BR_SYS:
+ if (aarch64_insn_is_ret(insn) &&
+ aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn) == AARCH64_INSN_REG_LR) {
+ *type = INSN_RETURN;
+ } else if (aarch64_insn_is_bl(insn)) {
+ *type = INSN_CALL;
+ *immediate = aarch64_get_branch_offset(insn);
+ } else if (aarch64_insn_is_blr(insn)) {
+ *type = INSN_CALL_DYNAMIC;
+ } else if (aarch64_insn_is_b(insn)) {
+ *type = INSN_JUMP_UNCONDITIONAL;
+ *immediate = aarch64_get_branch_offset(insn);
+ } else if (aarch64_insn_is_br(insn)) {
+ *type = INSN_JUMP_DYNAMIC;
+ } else if (aarch64_insn_is_branch_imm(insn)) {
+ /* Remaining branch opcodes are conditional */
+ *type = INSN_JUMP_CONDITIONAL;
+ *immediate = aarch64_get_branch_offset(insn);
+ } else {
+ *type = INSN_OTHER;
+ }
+ break;
default:
*type = INSN_OTHER;
break;
--
2.25.4

2021-01-20 19:13:30

by Julien Thierry

[permalink] [raw]
Subject: [RFC PATCH 04/17] objtool: arm64: Add base definition for arm64 backend

Provide needed definitions for a new architecture instruction decoder.
No proper decoding is done yet.

Signed-off-by: Julien Thierry <[email protected]>
---
tools/objtool/Makefile | 5 +
tools/objtool/arch/arm64/Build | 8 ++
tools/objtool/arch/arm64/decode.c | 130 ++++++++++++++++++
.../arch/arm64/include/arch/cfi_regs.h | 14 ++
tools/objtool/arch/arm64/include/arch/elf.h | 6 +
.../arch/arm64/include/arch/endianness.h | 9 ++
.../objtool/arch/arm64/include/arch/special.h | 21 +++
tools/objtool/arch/arm64/special.c | 21 +++
tools/objtool/sync-check.sh | 5 +
9 files changed, 219 insertions(+)
create mode 100644 tools/objtool/arch/arm64/Build
create mode 100644 tools/objtool/arch/arm64/decode.c
create mode 100644 tools/objtool/arch/arm64/include/arch/cfi_regs.h
create mode 100644 tools/objtool/arch/arm64/include/arch/elf.h
create mode 100644 tools/objtool/arch/arm64/include/arch/endianness.h
create mode 100644 tools/objtool/arch/arm64/include/arch/special.h
create mode 100644 tools/objtool/arch/arm64/special.c

diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index 92ce4fce7bc7..d5cfbec87c02 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -47,6 +47,11 @@ ifeq ($(SRCARCH),x86)
SUBCMD_ORC := y
endif

+ifeq ($(SRCARCH),arm64)
+ SUBCMD_CHECK := y
+ CFLAGS += -Wno-nested-externs
+endif
+
export SUBCMD_CHECK SUBCMD_ORC
export srctree OUTPUT CFLAGS SRCARCH AWK
include $(srctree)/tools/build/Makefile.include
diff --git a/tools/objtool/arch/arm64/Build b/tools/objtool/arch/arm64/Build
new file mode 100644
index 000000000000..f3de3a50d541
--- /dev/null
+++ b/tools/objtool/arch/arm64/Build
@@ -0,0 +1,8 @@
+objtool-y += special.o
+objtool-y += decode.o
+
+objtool-y += libhweight.o
+
+$(OUTPUT)arch/arm64/libhweight.o: ../lib/hweight.c FORCE
+ $(call rule_mkdir)
+ $(call if_changed_dep,cc_o_c)
diff --git a/tools/objtool/arch/arm64/decode.c b/tools/objtool/arch/arm64/decode.c
new file mode 100644
index 000000000000..8ae822f553ca
--- /dev/null
+++ b/tools/objtool/arch/arm64/decode.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+
+/* Hack needed to avoid depending on brk-imm.h */
+#define FAULT_BRK_IMM 0x100
+
+#include <asm/aarch64-insn.h>
+
+#include <objtool/check.h>
+#include <objtool/arch.h>
+#include <objtool/elf.h>
+#include <objtool/warn.h>
+
+#include <arch/cfi_regs.h>
+
+/* Hack needed to avoid depending on kprobes.h */
+#ifndef __kprobes
+#define __kprobes
+#endif
+
+#include "../../../arch/arm64/lib/aarch64-insn.c"
+
+bool arch_callee_saved_reg(unsigned char reg)
+{
+ switch (reg) {
+ case AARCH64_INSN_REG_19:
+ case AARCH64_INSN_REG_20:
+ case AARCH64_INSN_REG_21:
+ case AARCH64_INSN_REG_22:
+ case AARCH64_INSN_REG_23:
+ case AARCH64_INSN_REG_24:
+ case AARCH64_INSN_REG_25:
+ case AARCH64_INSN_REG_26:
+ case AARCH64_INSN_REG_27:
+ case AARCH64_INSN_REG_28:
+ case AARCH64_INSN_REG_FP:
+ case AARCH64_INSN_REG_LR:
+ return true;
+ default:
+ return false;
+ }
+}
+
+void arch_initial_func_cfi_state(struct cfi_init_state *state)
+{
+ int i;
+
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ state->regs[i].base = CFI_UNDEFINED;
+ state->regs[i].offset = 0;
+ }
+
+ /* initial CFA (call frame address) */
+ state->cfa.base = CFI_SP;
+ state->cfa.offset = 0;
+}
+
+unsigned long arch_dest_reloc_offset(int addend)
+{
+ return addend;
+}
+
+unsigned long arch_jump_destination(struct instruction *insn)
+{
+ return insn->offset + insn->immediate;
+}
+
+const char *arch_nop_insn(int len)
+{
+ static u32 nop = 0;
+
+ if (len != AARCH64_INSN_SIZE)
+ WARN("invalid NOP size: %d\n", len);
+
+ if (!nop)
+ nop = aarch64_insn_gen_nop();
+
+ return (const char*)&nop;
+}
+
+static int is_arm64(const struct elf *elf)
+{
+ switch (elf->ehdr.e_machine) {
+ case EM_AARCH64: //0xB7
+ return 1;
+ default:
+ WARN("unexpected ELF machine type %x",
+ elf->ehdr.e_machine);
+ return 0;
+ }
+}
+
+int arch_decode_hint_reg(struct instruction *insn, u8 sp_reg)
+{
+ return -1;
+}
+
+int arch_decode_instruction(const struct elf *elf, const struct section *sec,
+ unsigned long offset, unsigned int maxlen,
+ unsigned int *len, enum insn_type *type,
+ unsigned long *immediate,
+ struct list_head *ops_list)
+{
+ u32 insn;
+
+ if (!is_arm64(elf))
+ return -1;
+
+ if (maxlen < AARCH64_INSN_SIZE)
+ return 0;
+
+ *len = AARCH64_INSN_SIZE;
+ *immediate = 0;
+
+ insn = *(u32 *)(sec->data->d_buf + offset);
+
+ switch (aarch64_get_insn_class(insn)) {
+ case AARCH64_INSN_CLS_UNKNOWN:
+ WARN("can't decode instruction at %s:0x%lx", sec->name, offset);
+ return -1;
+ default:
+ *type = INSN_OTHER;
+ break;
+ }
+
+ return 0;
+}
diff --git a/tools/objtool/arch/arm64/include/arch/cfi_regs.h b/tools/objtool/arch/arm64/include/arch/cfi_regs.h
new file mode 100644
index 000000000000..43ad56b6c3f9
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/cfi_regs.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _OBJTOOL_CFI_REGS_H
+#define _OBJTOOL_CFI_REGS_H
+
+#include <asm/aarch64-insn.h>
+
+#define CFI_BP AARCH64_INSN_REG_FP
+#define CFI_RA AARCH64_INSN_REG_LR
+#define CFI_SP AARCH64_INSN_REG_SP
+
+#define CFI_NUM_REGS 32
+
+#endif /* _OBJTOOL_CFI_REGS_H */
diff --git a/tools/objtool/arch/arm64/include/arch/elf.h b/tools/objtool/arch/arm64/include/arch/elf.h
new file mode 100644
index 000000000000..a31a29b1a386
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/elf.h
@@ -0,0 +1,6 @@
+#ifndef _OBJTOOL_ARCH_ELF
+#define _OBJTOOL_ARCH_ELF
+
+#define R_NONE R_AARCH64_NONE
+
+#endif /* _OBJTOOL_ARCH_ELF */
diff --git a/tools/objtool/arch/arm64/include/arch/endianness.h b/tools/objtool/arch/arm64/include/arch/endianness.h
new file mode 100644
index 000000000000..7c362527da20
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/endianness.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ARCH_ENDIANNESS_H
+#define _ARCH_ENDIANNESS_H
+
+#include <endian.h>
+
+#define __TARGET_BYTE_ORDER __LITTLE_ENDIAN
+
+#endif /* _ARCH_ENDIANNESS_H */
diff --git a/tools/objtool/arch/arm64/include/arch/special.h b/tools/objtool/arch/arm64/include/arch/special.h
new file mode 100644
index 000000000000..a82a9b3e51df
--- /dev/null
+++ b/tools/objtool/arch/arm64/include/arch/special.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _ARM64_ARCH_SPECIAL_H
+#define _ARM64_ARCH_SPECIAL_H
+
+#define EX_ENTRY_SIZE 8
+#define EX_ORIG_OFFSET 0
+#define EX_NEW_OFFSET 4
+
+#define JUMP_ENTRY_SIZE 16
+#define JUMP_ORIG_OFFSET 0
+#define JUMP_NEW_OFFSET 4
+
+#define ALT_ENTRY_SIZE 12
+#define ALT_ORIG_OFFSET 0
+#define ALT_NEW_OFFSET 4
+#define ALT_FEATURE_OFFSET 8
+#define ALT_ORIG_LEN_OFFSET 10
+#define ALT_NEW_LEN_OFFSET 11
+
+#endif /* _ARM64_ARCH_SPECIAL_H */
diff --git a/tools/objtool/arch/arm64/special.c b/tools/objtool/arch/arm64/special.c
new file mode 100644
index 000000000000..45f283283091
--- /dev/null
+++ b/tools/objtool/arch/arm64/special.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include <objtool/special.h>
+
+void arch_handle_alternative(unsigned short feature, struct special_alt *alt)
+{
+}
+
+bool arch_support_alt_relocation(struct special_alt *special_alt,
+ struct instruction *insn,
+ struct reloc *reloc)
+{
+ return false;
+}
+
+
+struct reloc *arch_find_switch_table(struct objtool_file *file,
+ struct instruction *insn)
+{
+ return NULL;
+}
diff --git a/tools/objtool/sync-check.sh b/tools/objtool/sync-check.sh
index 606a4b5e929f..69e7ebe8911b 100755
--- a/tools/objtool/sync-check.sh
+++ b/tools/objtool/sync-check.sh
@@ -21,6 +21,11 @@ arch/x86/include/asm/insn.h -I '^#include [\"<]\(asm/\)*inat.h[\">]'
arch/x86/lib/inat.c -I '^#include [\"<]\(../include/\)*asm/insn.h[\">]'
arch/x86/lib/insn.c -I '^#include [\"<]\(../include/\)*asm/in\(at\|sn\).h[\">]' -I '^#include [\"<]\(../include/\)*asm/emulate_prefix.h[\">]'
"
+elif [ "$SRCARCH" = "arm64" ]; then
+FILES="$FILES
+arch/arm64/include/asm/aarch64-insn.h -I '^#include [\"<]\(asm/\)*brk-imm.h[\">]'
+arch/arm64/lib/aarch64-insn.c -I '^#include [\"<]\(asm/\)*kprobes.h[\">]'
+"
fi

check_2 () {
--
2.25.4

2021-01-21 09:12:40

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

Hello Julien,

On Wed, 20 Jan 2021 at 18:38, Julien Thierry <[email protected]> wrote:
>
> Hi,
>
> This series enables objtool to start doing stack validation on arm64
> kernel builds.

Could we elaborate on this point, please? 'Stack validation' means
getting an accurate picture of all kernel code that will be executed
at some point in the future, due to the fact that there are stack
frames pointing to them. And this ability is essential in order to do
live patching safely?

If this is the goal, I wonder whether this is the right approach for
arm64 (or for any other architecture, for that matter)

Parsing/decoding the object code and even worse, relying on GCC
plugins to annotate some of the idioms as they are being generated, in
order to infer intent on the part of the compiler goes *way* beyond
what we should be comfortable with. The whole point of this exercise
is to guarantee that there are no false positives when it comes to
deciding whether the kernel is in a live patchable state, and I don't
see how we can ever provide such a guarantee when it is built on such
a fragile foundation.

If we want to ensure that the stack contents are always an accurate
reflection of the real call stack, we should work with the toolchain
folks to identify issues that may interfere with this, and implement
controls over these behaviors that we can decide to use in the build.
In the past, I have already proposed adding a 'kernel' code model to
the AArch64 compiler that guarantees certain things, such as adrp/add
for symbol references, and no GOT indirections for position
independent code. Inhibiting optimizations that may impact our ability
to infer the real call stack from the stack contents is something we
might add here as well.

Another thing that occurred to me is that inferring which kernel code
is actually live in terms of pending function returns could be
inferred much more easily from a shadow call stack, which is a thing
we already implement for Clang builds.

In summary, I would not be in favor of enabling objtool on arm64 at
all until we have exhausted other options for providing the
functionality that we need it for (given that objtool provides many
other things that only x86 cares about, IIUC)

--
Ard.



> It relies on the previous series I sent, refactoring
> the arm64 decoder [1].
>
> First, the aarch64 instruction decoder needed to be made available
> to code under tools/. This is done in a similar manner to x86
> instruction decoded. One limitation I encountered there is that most
> of aarch64 instruction decoder is __kprobe annotated. To bypass that
> it remove the kprobe include and had to add an empty __kprobe
> definition, but I'd welcome a proper solution to that.
>
> Then instruction semantics are progressively added so objtool can track
> the stack state through the execution flow.
> There are a few things that needed consideration:
> - Generation of constants within executable sections, these either
> caused objtool to fail decoding or to wrongly decode constants
> as jumps or other instructions affecting execution flow and
> causing confusion. To solve this, tracking locations referenced
> by instructions using literals was needed.
> - Jump tables from switch statements in aarch64 don't have enough
> information to link branches with the branch instruction leading to
> them. For this, we use a gcc plugin to add some information to establish
> those missing links in a format that is already supported by objtool
>
> With this, there are still some errors when building with objtool. A
> number of cleanups/annotations are needed on the arm64, as well as
> handling SYM_DATA objects in objtool.
>
> Those changes can be found on top of this branch here:
> git clone https://github.com/julien-thierry/linux.git -b objtoolxarm64-latest
>
> But it would be nice to have some feedback on this before I start submitting everyting.
>
> [1] https://lkml.org/lkml/2021/1/20/791
>
> Thanks,
>
> Julien
>
> -->
>
> Julien Thierry (15):
> tools: Add some generic functions and headers
> tools: arm64: Make aarch64 instruction decoder available to tools
> tools: bug: Remove duplicate definition
> objtool: arm64: Add base definition for arm64 backend
> objtool: arm64: Decode add/sub instructions
> objtool: arm64: Decode jump and call related instructions
> objtool: arm64: Decode other system instructions
> objtool: arm64: Decode load/store instructions
> objtool: arm64: Decode LDR instructions
> objtool: arm64: Accept padding in code sections
> efi: libstub: Ignore relocations for .discard sections
> objtool: arm64: Implement functions to add switch tables alternatives
> objtool: arm64: Cache section with switch table information
> objtool: arm64: Handle supported relocations in alternatives
> objtool: arm64: Ignore replacement section for alternative callback
>
> Raphael Gault (2):
> gcc-plugins: objtool: Add plugin to detect switch table on arm64
> objtool: arm64: Enable stack validation for arm64
>
> arch/arm64/Kconfig | 2 +
> drivers/firmware/efi/libstub/Makefile | 2 +-
> scripts/Makefile.gcc-plugins | 2 +
> scripts/gcc-plugins/Kconfig | 4 +
> .../arm64_switch_table_detection_plugin.c | 85 +
> tools/arch/arm64/include/asm/aarch64-insn.h | 551 +++++++
> tools/arch/arm64/lib/aarch64-insn.c | 1425 +++++++++++++++++
> tools/include/asm-generic/bitops/__ffs.h | 11 +
> tools/include/linux/bug.h | 6 +-
> tools/include/linux/kernel.h | 21 +
> tools/include/linux/printk.h | 40 +
> tools/objtool/Makefile | 5 +
> tools/objtool/arch/arm64/Build | 8 +
> tools/objtool/arch/arm64/decode.c | 471 ++++++
> .../arch/arm64/include/arch/cfi_regs.h | 14 +
> tools/objtool/arch/arm64/include/arch/elf.h | 6 +
> .../arch/arm64/include/arch/endianness.h | 9 +
> .../objtool/arch/arm64/include/arch/special.h | 23 +
> tools/objtool/arch/arm64/special.c | 134 ++
> tools/objtool/arch/x86/decode.c | 5 +
> tools/objtool/check.c | 6 +
> tools/objtool/include/objtool/arch.h | 3 +
> tools/objtool/sync-check.sh | 5 +
> 23 files changed, 2832 insertions(+), 6 deletions(-)
> create mode 100644 scripts/gcc-plugins/arm64_switch_table_detection_plugin.c
> create mode 100644 tools/arch/arm64/include/asm/aarch64-insn.h
> create mode 100644 tools/arch/arm64/lib/aarch64-insn.c
> create mode 100644 tools/include/linux/printk.h
> create mode 100644 tools/objtool/arch/arm64/Build
> create mode 100644 tools/objtool/arch/arm64/decode.c
> create mode 100644 tools/objtool/arch/arm64/include/arch/cfi_regs.h
> create mode 100644 tools/objtool/arch/arm64/include/arch/elf.h
> create mode 100644 tools/objtool/arch/arm64/include/arch/endianness.h
> create mode 100644 tools/objtool/arch/arm64/include/arch/special.h
> create mode 100644 tools/objtool/arch/arm64/special.c
>
> --
> 2.25.4
>

2021-01-21 10:32:00

by Julien Thierry

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

Hi Ard,

On 1/21/21 10:03 AM, Ard Biesheuvel wrote:
> Hello Julien,
>
> On Wed, 20 Jan 2021 at 18:38, Julien Thierry <[email protected]> wrote:
>>
>> Hi,
>>
>> This series enables objtool to start doing stack validation on arm64
>> kernel builds.
>
> Could we elaborate on this point, please? 'Stack validation' means
> getting an accurate picture of all kernel code that will be executed
> at some point in the future, due to the fact that there are stack
> frames pointing to them. And this ability is essential in order to do
> live patching safely?
>
> If this is the goal, I wonder whether this is the right approach for
> arm64 (or for any other architecture, for that matter)
>
> Parsing/decoding the object code and even worse, relying on GCC
> plugins to annotate some of the idioms as they are being generated, in
> order to infer intent on the part of the compiler goes *way* beyond
> what we should be comfortable with. The whole point of this exercise
> is to guarantee that there are no false positives when it comes to
> deciding whether the kernel is in a live patchable state, and I don't
> see how we can ever provide such a guarantee when it is built on such
> a fragile foundation.
>
> If we want to ensure that the stack contents are always an accurate
> reflection of the real call stack, we should work with the toolchain
> folks to identify issues that may interfere with this, and implement
> controls over these behaviors that we can decide to use in the build.
> In the past, I have already proposed adding a 'kernel' code model to
> the AArch64 compiler that guarantees certain things, such as adrp/add
> for symbol references, and no GOT indirections for position
> independent code. Inhibiting optimizations that may impact our ability
> to infer the real call stack from the stack contents is something we
> might add here as well.
>

I'm not familiar with toolcahin code models, but would this approach be
able to validate assembly code (either inline or in assembly files?)

> Another thing that occurred to me is that inferring which kernel code
> is actually live in terms of pending function returns could be
> inferred much more easily from a shadow call stack, which is a thing
> we already implement for Clang builds.
>

I was not familiar with the shadow call stack. If I understand correctly
that would be a stack of return addresses of function currently on the
call stack, is that correct?

That would indeed be a simpler approach, however I guess the
instrumentation has a cost. Is the instrumentation also available with
GCC? And is this instrumentation efficient enough to be suitable for
production builds?

If we can rely on shadow call stack to implement the reliable unwinder,
I guess this could be the way to go.

> In summary, I would not be in favor of enabling objtool on arm64 at
> all until we have exhausted other options for providing the
> functionality that we need it for (given that objtool provides many
> other things that only x86 cares about, IIUC)
>
I understand the concern and appreciate the suggestion. I guess this
does need some thorough discussions for the right approach.

Thanks,

--
Julien Thierry

2021-01-21 11:13:28

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Thu, 21 Jan 2021 at 11:26, Julien Thierry <[email protected]> wrote:
>
> Hi Ard,
>
> On 1/21/21 10:03 AM, Ard Biesheuvel wrote:
> > Hello Julien,
> >
> > On Wed, 20 Jan 2021 at 18:38, Julien Thierry <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> This series enables objtool to start doing stack validation on arm64
> >> kernel builds.
> >
> > Could we elaborate on this point, please? 'Stack validation' means
> > getting an accurate picture of all kernel code that will be executed
> > at some point in the future, due to the fact that there are stack
> > frames pointing to them. And this ability is essential in order to do
> > live patching safely?
> >
> > If this is the goal, I wonder whether this is the right approach for
> > arm64 (or for any other architecture, for that matter)
> >
> > Parsing/decoding the object code and even worse, relying on GCC
> > plugins to annotate some of the idioms as they are being generated, in
> > order to infer intent on the part of the compiler goes *way* beyond
> > what we should be comfortable with. The whole point of this exercise
> > is to guarantee that there are no false positives when it comes to
> > deciding whether the kernel is in a live patchable state, and I don't
> > see how we can ever provide such a guarantee when it is built on such
> > a fragile foundation.
> >
> > If we want to ensure that the stack contents are always an accurate
> > reflection of the real call stack, we should work with the toolchain
> > folks to identify issues that may interfere with this, and implement
> > controls over these behaviors that we can decide to use in the build.
> > In the past, I have already proposed adding a 'kernel' code model to
> > the AArch64 compiler that guarantees certain things, such as adrp/add
> > for symbol references, and no GOT indirections for position
> > independent code. Inhibiting optimizations that may impact our ability
> > to infer the real call stack from the stack contents is something we
> > might add here as well.
> >
>
> I'm not familiar with toolcahin code models, but would this approach be
> able to validate assembly code (either inline or in assembly files?)
>

No, it would not. But those files are part of the code base, and can
be reviewed and audited.

> > Another thing that occurred to me is that inferring which kernel code
> > is actually live in terms of pending function returns could be
> > inferred much more easily from a shadow call stack, which is a thing
> > we already implement for Clang builds.
> >
>
> I was not familiar with the shadow call stack. If I understand correctly
> that would be a stack of return addresses of function currently on the
> call stack, is that correct?
>
> That would indeed be a simpler approach, however I guess the
> instrumentation has a cost. Is the instrumentation also available with
> GCC? And is this instrumentation efficient enough to be suitable for
> production builds?
>

I am not aware of any plans to enable this in GCC, but the Clang
implementation is definitely intended for production use (it's a CFI
feature for ROP/JOP mitigation)

> If we can rely on shadow call stack to implement the reliable unwinder,
> I guess this could be the way to go.
>
> > In summary, I would not be in favor of enabling objtool on arm64 at
> > all until we have exhausted other options for providing the
> > functionality that we need it for (given that objtool provides many
> > other things that only x86 cares about, IIUC)
> >
> I understand the concern and appreciate the suggestion. I guess this
> does need some thorough discussions for the right approach.
>

Agreed.

2021-01-21 11:41:01

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Thu, Jan 21, 2021 at 12:08:23PM +0100, Ard Biesheuvel wrote:
> On Thu, 21 Jan 2021 at 11:26, Julien Thierry <[email protected]> wrote:

> > I'm not familiar with toolcahin code models, but would this approach be
> > able to validate assembly code (either inline or in assembly files?)
> >
>
> No, it would not. But those files are part of the code base, and can
> be reviewed and audited.

x86 has a long history if failing at exactly that.

2021-01-21 11:52:00

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Thu, 21 Jan 2021 at 12:23, Peter Zijlstra <[email protected]> wrote:
>
> On Thu, Jan 21, 2021 at 12:08:23PM +0100, Ard Biesheuvel wrote:
> > On Thu, 21 Jan 2021 at 11:26, Julien Thierry <[email protected]> wrote:
>
> > > I'm not familiar with toolcahin code models, but would this approach be
> > > able to validate assembly code (either inline or in assembly files?)
> > >
> >
> > No, it would not. But those files are part of the code base, and can
> > be reviewed and audited.
>
> x86 has a long history if failing at exactly that.

That's a fair point. But on the flip side, maintaining objtool does
not look like it has been a walk in the park either.

What i am especially concerned about is things like 3193c0836f20,
where we actually have to disable certain compiler optimizations
because they interfere with objtool's ability to understand the
resulting object code. Correctness and performance are challenging
enough as requirements for generated code.

Mind you, I am not saying it is not worth it *for x86*, where there is
a lot of other stuff going on. But on arm64, we don't care about ORC,
about -fomit-frame-pointer, about retpolines or about any of the other
things objtool enables.

On arm64, all it currently seems to provide is a way to capture the
call stack accurately, and given that it needs a GCC plugin for this
(which needs to be maintained as well, which is non-trivial, and also
bars us from using objtool with Clang builds), my current position is
simply that opening this can of worms at this point is just not worth
it.

2021-01-21 13:29:52

by Julien Thierry

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64



On 1/21/21 12:08 PM, Ard Biesheuvel wrote:
> On Thu, 21 Jan 2021 at 11:26, Julien Thierry <[email protected]> wrote:
>>
>> Hi Ard,
>>
>> On 1/21/21 10:03 AM, Ard Biesheuvel wrote:
>>> Hello Julien,
>>>
>>> On Wed, 20 Jan 2021 at 18:38, Julien Thierry <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> This series enables objtool to start doing stack validation on arm64
>>>> kernel builds.
>>>
>>> Could we elaborate on this point, please? 'Stack validation' means
>>> getting an accurate picture of all kernel code that will be executed
>>> at some point in the future, due to the fact that there are stack
>>> frames pointing to them. And this ability is essential in order to do
>>> live patching safely?
>>>
>>> If this is the goal, I wonder whether this is the right approach for
>>> arm64 (or for any other architecture, for that matter)
>>>
>>> Parsing/decoding the object code and even worse, relying on GCC
>>> plugins to annotate some of the idioms as they are being generated, in
>>> order to infer intent on the part of the compiler goes *way* beyond
>>> what we should be comfortable with. The whole point of this exercise
>>> is to guarantee that there are no false positives when it comes to
>>> deciding whether the kernel is in a live patchable state, and I don't
>>> see how we can ever provide such a guarantee when it is built on such
>>> a fragile foundation.
>>>
>>> If we want to ensure that the stack contents are always an accurate
>>> reflection of the real call stack, we should work with the toolchain
>>> folks to identify issues that may interfere with this, and implement
>>> controls over these behaviors that we can decide to use in the build.
>>> In the past, I have already proposed adding a 'kernel' code model to
>>> the AArch64 compiler that guarantees certain things, such as adrp/add
>>> for symbol references, and no GOT indirections for position
>>> independent code. Inhibiting optimizations that may impact our ability
>>> to infer the real call stack from the stack contents is something we
>>> might add here as well.
>>>
>>
>> I'm not familiar with toolcahin code models, but would this approach be
>> able to validate assembly code (either inline or in assembly files?)
>>
>
> No, it would not. But those files are part of the code base, and can
> be reviewed and audited.
>

That means that every actor maintaining their own stable version of the
kernel have to do their own audit when they do backports (assuming the
audit would be done for upstream) to be able to provide a safe
livepatching feature in their kernel.

>>> Another thing that occurred to me is that inferring which kernel code
>>> is actually live in terms of pending function returns could be
>>> inferred much more easily from a shadow call stack, which is a thing
>>> we already implement for Clang builds.
>>>
>>
>> I was not familiar with the shadow call stack. If I understand correctly
>> that would be a stack of return addresses of function currently on the
>> call stack, is that correct?
>>
>> That would indeed be a simpler approach, however I guess the
>> instrumentation has a cost. Is the instrumentation also available with
>> GCC? And is this instrumentation efficient enough to be suitable for
>> production builds?
>>
>
> I am not aware of any plans to enable this in GCC, but the Clang
> implementation is definitely intended for production use (it's a CFI
> feature for ROP/JOP mitigation)
>

I think most people interested in livepatching are using GCC built
kernels, but I could be mistaken (althought in the long run, both
compilers should be supported, and yes, I realize the objtool solution
currently only would support GCC).

I don't know how feasible it will be to get it into GCC if people decide
to go with that. Also, now that I think about it, it will probably come
with similar limitations as stackframes where the unwinder would need to
know when/where the shadow call stack is unavailable for some reason and
the stack trace is not reliable. (it might be a bit simpler to audit
than stack frame setting and maybe have less limitations, but I guess
there will still be cases where we can't rely on it)

--
Julien Thierry

2021-01-21 14:29:14

by Mark Brown

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Thu, Jan 21, 2021 at 02:23:53PM +0100, Julien Thierry wrote:
> On 1/21/21 12:08 PM, Ard Biesheuvel wrote:

> > I am not aware of any plans to enable this in GCC, but the Clang
> > implementation is definitely intended for production use (it's a CFI
> > feature for ROP/JOP mitigation)

> I think most people interested in livepatching are using GCC built kernels,
> but I could be mistaken (althought in the long run, both compilers should be
> supported, and yes, I realize the objtool solution currently only would
> support GCC).

There definitely seem to be some users interested in both livepatch and
clang built kernels so it might come up relatively quickly.


Attachments:
(No filename) (683.00 B)
signature.asc (499.00 B)
Download all attachments

2021-01-21 19:31:55

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

Ard,

Sorry, I was late to the party, attempting to reply to the entire thread
at once. Also, adding the live-patching ML.

I agree with a lot of your concerns. Reverse engineering the control
flow of the compiled binary is kind of ridiculous. I was always
surprised that it works. I still am! But I think it's more robust than
you give it credit for.

Most of the existing code just works, with (annual) tweaks for new
compiler versions. In fact now it works well with both GCC and Clang,
across several versions. Soon it will work with LTO.

It has grown many uses beyond stack validation: ORC, static calls,
retpolines validation, noinstr validation, SMAP validation. It has
found a *lot* of compiler bugs. And there will probably be more use
cases when we get vmlinux validation running fast enough.

But there is indeed a maintenance burden. I often ask myself if it's
worth it. So far the answer has been yes :-) Particularly because it
has influenced many positive changes to the kernel. And it helps now
that even more people are contributing and adding useful features.

But you should definitely think twice before letting it loose on your
arch, especially if you have some other way to ensure reliable stack
metadata, and if you don't have a need for the other objtool features.


Regarding your other proposals:

1) I'm doubtful we can ever rely on the toolchain to ensure reliable
unwind metadata, because:

a) most of the problems are in asm and inline-asm; good luck getting
the toolchain to care about those.

b) the toolchain is fragile; do we want to entrust the integrity of
live patching to the compiler's frame pointer generation (or other
unwind metadata) without having some sort of checking mechanism?


2) The shadow stack idea sounds promising -- how hard would it be to
make a prototype reliable unwinder?


More comments below:


On Thu, Jan 21, 2021 at 12:48:43PM +0100, Ard Biesheuvel wrote:
> On Thu, 21 Jan 2021 at 12:23, Peter Zijlstra <[email protected]> wrote:
> >
> > On Thu, Jan 21, 2021 at 12:08:23PM +0100, Ard Biesheuvel wrote:
> > > On Thu, 21 Jan 2021 at 11:26, Julien Thierry <[email protected]> wrote:
> >
> > > > I'm not familiar with toolcahin code models, but would this approach be
> > > > able to validate assembly code (either inline or in assembly files?)
> > > >
> > >
> > > No, it would not. But those files are part of the code base, and can
> > > be reviewed and audited.
> >
> > x86 has a long history if failing at exactly that.
>
> That's a fair point. But on the flip side, maintaining objtool does
> not look like it has been a walk in the park either.

I think you missed Peter's point: it's not that it's *hard* for humans
to continuously review/audit all asm and inline-asm; it's just not
feasible to do it 100% correctly, 100% of the time.

Like any other code, objtool requires maintenance, but its analysis is
orders of magnitude more robust than any human.

> What i am especially concerned about is things like 3193c0836f20,
> where we actually have to disable certain compiler optimizations
> because they interfere with objtool's ability to understand the
> resulting object code. Correctness and performance are challenging
> enough as requirements for generated code.

Well, you managed to find the worst case scenario. I think that's the
only time we ever had to do that. Please don't think that's normal, or
even a generally realistic concern. Objtool tries really hard to stay
out of the way.

Long term we really want to prevent that type of thing with the help of
annotations from compiler plugins, similar to what Julien did here.

Yes, it would mean two objtool compiler plugins (GCC and Clang), but it
would ease the objtool maintenance burden and risk in many ways. And
prevent situations like that commit you found. It may sound fragile,
but it will actually make things simpler overall: less reverse
engineering of GCC switch jump tables and __noreturn functions is a good
thing.

> Mind you, I am not saying it is not worth it *for x86*, where there is
> a lot of other stuff going on. But on arm64, we don't care about ORC,
> about -fomit-frame-pointer, about retpolines or about any of the other
> things objtool enables.
>
> On arm64, all it currently seems to provide is a way to capture the
> call stack accurately, and given that it needs a GCC plugin for this
> (which needs to be maintained as well, which is non-trivial, and also
> bars us from using objtool with Clang builds), my current position is
> simply that opening this can of worms at this point is just not worth
> it.

As far as GCC plugins go, it looks pretty basic to me. Also this
doesn't *at all* prevent Clang from being used for live patching. If
anybody actually tries running Julien's patches on a Clang-built kernel,
it might just work. But if not, and the switch tables turn out to be
unparseable like on GCC, we could have a Clang plugin. As I mentioned,
we'll probably end up having one anyway for x86.

--
Josh

2021-01-22 17:51:40

by Mark Brown

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Thu, Jan 21, 2021 at 12:54:52PM -0600, Josh Poimboeuf wrote:

> 2) The shadow stack idea sounds promising -- how hard would it be to
> make a prototype reliable unwinder?

In theory it doesn't look too hard and I can't see a particular reason
not to try doing this - there's going to be edge cases but hopefully for
reliable stack trace they're all in areas where we would be happy to
just decide the stack isn't reliable anyway, things like nesting which
allocates separate shadow stacks for each nested level for example.
I'll take a look.


Attachments:
(No filename) (559.00 B)
signature.asc (499.00 B)
Download all attachments

2021-01-22 18:06:03

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Fri, 22 Jan 2021 at 18:44, Mark Brown <[email protected]> wrote:
>
> On Thu, Jan 21, 2021 at 12:54:52PM -0600, Josh Poimboeuf wrote:
>
> > 2) The shadow stack idea sounds promising -- how hard would it be to
> > make a prototype reliable unwinder?
>
> In theory it doesn't look too hard and I can't see a particular reason
> not to try doing this - there's going to be edge cases but hopefully for
> reliable stack trace they're all in areas where we would be happy to
> just decide the stack isn't reliable anyway, things like nesting which
> allocates separate shadow stacks for each nested level for example.
> I'll take a look.

This reminds me - a while ago, I had a stab at writing a rudimentary
GCC plugin that pushes/pops return addresses to a shadow call stack
pointed to by x18 [0]
I am by no means suggesting that we should rely on a GCC plugin for
this, only that it does seem rather straight-forward for the compiler
to manage a stack with return addresses like that (although the devil
is probably in the details, as usual)

[0] https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm64-scs-gcc

2021-01-22 21:20:45

by Madhavan T. Venkataraman

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64



On 1/22/21 11:43 AM, Mark Brown wrote:
> On Thu, Jan 21, 2021 at 12:54:52PM -0600, Josh Poimboeuf wrote:
>
>> 2) The shadow stack idea sounds promising -- how hard would it be to
>> make a prototype reliable unwinder?
>
> In theory it doesn't look too hard and I can't see a particular reason
> not to try doing this - there's going to be edge cases but hopefully for
> reliable stack trace they're all in areas where we would be happy to
> just decide the stack isn't reliable anyway, things like nesting which
> allocates separate shadow stacks for each nested level for example.
> I'll take a look.
>

I am a new comer to this discussion and I am learning. Just have some
questions. Pardon me if they are obvious or if they have already been
asked and answered.

Doesn't Clang already have support for a shadow stack implementation for ARM64?
We could take a look at how Clang does it.

Will there not be a significant performance hit? May be, some of it can be
mitigated by using a parallel shadow stack rather than a compact one.

Are there any longjmp style situations in the kernel where the stack is
unwound by several frames? In these cases, the shadow stack must be unwound
accordingly.

Madhavan

>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

2021-01-22 21:22:01

by Madhavan T. Venkataraman

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64



On 1/22/21 11:43 AM, Mark Brown wrote:
> On Thu, Jan 21, 2021 at 12:54:52PM -0600, Josh Poimboeuf wrote:
>
>> 2) The shadow stack idea sounds promising -- how hard would it be to
>> make a prototype reliable unwinder?
>
> In theory it doesn't look too hard and I can't see a particular reason
> not to try doing this - there's going to be edge cases but hopefully for
> reliable stack trace they're all in areas where we would be happy to
> just decide the stack isn't reliable anyway, things like nesting which
> allocates separate shadow stacks for each nested level for example.
> I'll take a look.
>

I am a new comer to this discussion and I am learning. Just have some
questions. Pardon me if they are obvious or if they have already been
asked and answered.

Doesn't Clang already have support for a shadow stack implementation for ARM64?
We could take a look at how Clang does it.

Will there not be a significant performance hit? May be, some of it can be
mitigated by using a parallel shadow stack rather than a compact one.

Are there any longjmp style situations in the kernel where the stack is
unwound by several frames? In these cases, the shadow stack must be unwound
accordingly.

Madhavan

>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>

2021-01-22 21:45:59

by Ard Biesheuvel

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Fri, 22 Jan 2021 at 22:15, Madhavan T. Venkataraman
<[email protected]> wrote:
>
>
>
> On 1/22/21 11:43 AM, Mark Brown wrote:
> > On Thu, Jan 21, 2021 at 12:54:52PM -0600, Josh Poimboeuf wrote:
> >
> >> 2) The shadow stack idea sounds promising -- how hard would it be to
> >> make a prototype reliable unwinder?
> >
> > In theory it doesn't look too hard and I can't see a particular reason
> > not to try doing this - there's going to be edge cases but hopefully for
> > reliable stack trace they're all in areas where we would be happy to
> > just decide the stack isn't reliable anyway, things like nesting which
> > allocates separate shadow stacks for each nested level for example.
> > I'll take a look.
> >
>
> I am a new comer to this discussion and I am learning. Just have some
> questions. Pardon me if they are obvious or if they have already been
> asked and answered.
>
> Doesn't Clang already have support for a shadow stack implementation for ARM64?
> We could take a look at how Clang does it.
>
> Will there not be a significant performance hit? May be, some of it can be
> mitigated by using a parallel shadow stack rather than a compact one.
>
> Are there any longjmp style situations in the kernel where the stack is
> unwound by several frames? In these cases, the shadow stack must be unwound
> accordingly.
>

Hello Madhavan,

Let's discuss the details of shadow call stacks on a separate thread,
instead of further hijacking Julien's series.

2021-01-22 21:47:07

by Madhavan T. Venkataraman

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64



On 1/22/21 3:43 PM, Ard Biesheuvel wrote:
> On Fri, 22 Jan 2021 at 22:15, Madhavan T. Venkataraman
> <[email protected]> wrote:
>>
>>
>>
>> On 1/22/21 11:43 AM, Mark Brown wrote:
>>> On Thu, Jan 21, 2021 at 12:54:52PM -0600, Josh Poimboeuf wrote:
>>>
>>>> 2) The shadow stack idea sounds promising -- how hard would it be to
>>>> make a prototype reliable unwinder?
>>>
>>> In theory it doesn't look too hard and I can't see a particular reason
>>> not to try doing this - there's going to be edge cases but hopefully for
>>> reliable stack trace they're all in areas where we would be happy to
>>> just decide the stack isn't reliable anyway, things like nesting which
>>> allocates separate shadow stacks for each nested level for example.
>>> I'll take a look.
>>>
>>
>> I am a new comer to this discussion and I am learning. Just have some
>> questions. Pardon me if they are obvious or if they have already been
>> asked and answered.
>>
>> Doesn't Clang already have support for a shadow stack implementation for ARM64?
>> We could take a look at how Clang does it.
>>
>> Will there not be a significant performance hit? May be, some of it can be
>> mitigated by using a parallel shadow stack rather than a compact one.
>>
>> Are there any longjmp style situations in the kernel where the stack is
>> unwound by several frames? In these cases, the shadow stack must be unwound
>> accordingly.
>>
>
> Hello Madhavan,
>
> Let's discuss the details of shadow call stacks on a separate thread,
> instead of further hijacking Julien's series.
>

OK. Sounds good.

2021-01-25 21:23:57

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

On Fri, Jan 22, 2021 at 10:43:09PM +0100, Ard Biesheuvel wrote:
> On Fri, 22 Jan 2021 at 22:15, Madhavan T. Venkataraman <[email protected]> wrote:
> > On 1/22/21 11:43 AM, Mark Brown wrote:
> > > On Thu, Jan 21, 2021 at 12:54:52PM -0600, Josh Poimboeuf wrote:
> > >
> > >> 2) The shadow stack idea sounds promising -- how hard would it be to
> > >> make a prototype reliable unwinder?
> > >
> > > In theory it doesn't look too hard and I can't see a particular reason
> > > not to try doing this - there's going to be edge cases but hopefully for
> > > reliable stack trace they're all in areas where we would be happy to
> > > just decide the stack isn't reliable anyway, things like nesting which
> > > allocates separate shadow stacks for each nested level for example.
> > > I'll take a look.
> > >
> >
> > I am a new comer to this discussion and I am learning. Just have some
> > questions. Pardon me if they are obvious or if they have already been
> > asked and answered.
> >
> > Doesn't Clang already have support for a shadow stack implementation for ARM64?
> > We could take a look at how Clang does it.
> >
> > Will there not be a significant performance hit? May be, some of it can be
> > mitigated by using a parallel shadow stack rather than a compact one.
> >
> > Are there any longjmp style situations in the kernel where the stack is
> > unwound by several frames? In these cases, the shadow stack must be unwound
> > accordingly.
> >
>
> Hello Madhavan,
>
> Let's discuss the details of shadow call stacks on a separate thread,
> instead of further hijacking Julien's series.

It's quite relevant to this thread. There's no need to consider merging
Julien's patches if you have a better approach. Why not discuss it
here? I'm also interested in the answers to Madhavan's questions.

--
Josh

2021-01-28 22:13:04

by Madhavan T. Venkataraman

[permalink] [raw]
Subject: Re: [RFC PATCH 00/17] objtool: add base support for arm64

Hi,

I sent this suggestion to linux-arm-kernel in response to the Reliable Stacktrace RFC from Mark Brown
and Mark Rutland. I am repeating it here for two reasons:

- It involves objtool.

- There are many more recipients in this thread that may be interested in this topic.

Please let me know if this suggestion is acceptable. If it is not, please let me know why.
Thanks.

Also, I apologize to all of you who have received this more than once.

FP and no-FP functions
=====================

I have a suggestion for objtool and the unwinder for ARM64.

IIUC, objtool is responsible for walking all the code paths (except unreachable
and ignored ones) and making sure that every function has proper frame pointer
code (prolog, epilog, etc). If a function is found to not have it, the kernel
build is failed. Is this understanding correct?

If so, can we take a different approach for ARM64?

Instead of failing the kernel build, can we just mark the functions as:

FP Functions that have proper FP code
no-FP Functions that don't

May be, we can add an "FP" flag in the symbol table entry for this.

Then, the unwinder can check the functions it encounters in the stack trace and
inform the caller if it found any no-FP functions. The caller of the unwinder can
decide what he wants to do with that information.

- the caller can ignore it

- the caller can print the stack trace with a warning that no-FP functions
were found

- if the caller is livepatch, the caller can retry until the no-FP functions
disappear from the stack trace. This way, we can have live patching even
when some of the functions in the kernel are no-FP.

Does this make any sense? Is this acceptable? What are the pitfalls?

If we can do this, the unwinder could detect cases such as:

- If gcc thinks that a function is a leaf function but the function contains
inline assembly code that calls another function.

- If a call to a function bounces through some intermediate code such as a
trampoline.

- etc.

For specific no-FP functions, the unwinder might be able to deduce the original
caller. In these cases, the stack trace would still be reliable. For all the others,
the stack trace would be considered unreliable.

Compiler instead of objtool
===========================

If the above suggestion is acceptable, I have another suggestion.

It is a lot of work for every new architecture to add frame pointer verification
support in objtool. Can we get some help from the compiler?

The compiler knows which C functions it generates the FP prolog and epilog for. It can
mark those functions as FP. As for assembly functions, kernel developers could manually
annotate functions that have proper FP code. The compiler/assembler would mark them
as FP. Only a small subset of assembly functions would even have FP prolog and epilog.

Is this acceptable? What are the pitfalls?

This can be implemented easily for all architectures for which the compiler generates
FP code.

Can this be implemented using a GCC plugin? I know squat about GCC plugins.

Thanks!

Madhavan