2022-04-20 10:36:10

by Joao Moreira

Subject: [RFC PATCH 00/11] Kernel FineIBT Support

From: Joao Moreira <[email protected]>

Disclaimer: This is all in a very early/poc stage and is mostly research work
-- be advised to proceed with care and to bring a towel with you.

This patch series enables FineIBT in the kernel. FineIBT is a compiler-enhanced
forward-edge CFI scheme built on top of Intel's CET-IBT that works by setting a
hash on the caller side which is then checked on the callee side. Because IBT
requires indirect branches to land on ENDBR instructions, these hash checks
should not be bypassable when a function pointer is corrupted. More details
on the general FineIBT design can be found in [1, 2].

When compared to IBT itself, FineIBT imposes a more restrictive policy that
should be more robust against control-flow hijacking attacks. When compared to
schemes like KCFI [3], it has the benefit of not depending on memory reads
(which may be more efficient in terms of performance and power, and also
keeps it compatible with XOM [4]) and brings in the benefits of IBT
regarding speculative execution hardening.

A debatable point is the fact that on FineIBT the checks are made on the callee
side. On a quick look, this seems to be cool because it allows strict
reachability refinement of more security-critical functions (like hardware
feature disabling ones) while still allowing other less critical functions to be
relaxed/coarse-grained; under caller-side checks, if one single function is
required to be relaxed, this leads into an indirect call instruction being
relaxed, then becoming a branch capable of reaching all the functions in the
executable address space, including those considered security-critical. Inputs
and opinions on this are very welcome, as there are other perspectives about
this I might be missing.

This series relies heavily on the recently upstreamed IBT support and also
respins some sorcery proposed by Peter Zijlstra in his IBT v2 series [5]. A
large part of it repurposes work originally cast by Peter.

The FineIBT enablement uses a modified clang version to emit code with the
FineIBT hash set/check operations. The modified clang is available here [6]. The
.config used for building and testing is available here [7] along with more or
less general instructions on how to build it. A tree with this series on top is
available here [8].

Key insights:
- On IBT v2, Peter proposed an optimization that uses objtool to add an offset
to operands of direct branches targeting ENDBR instructions, skipping the need
to fetch/decode these. With FineIBT, skipping the ENDBR+hash check is not only
desirable but required, to prevent direct calls from being flagged as
violations when they reach a function without first setting the respective
hash. This series respins the approach and uses objtool to fix direct branches
targeting FineIBT hash check operations. Fixing this in objtool instead of using
the compiler is convenient because then it becomes easy to mix
FineIBT-instrumented code with binaries only augmented with regular ENDBRs (such
as currently-existing assembly).
- The same approach (identifying FineIBT hash check sequences and fixing direct
branch targets) is also used dynamically to support module loading (fix the
relocations), text-patching (support static calls), and on BPF (support jitted
calls to core functions).
- When a direct branch into a FineIBT hash check cannot be fixed (because it is
a short jump targeting an instruction which, once incremented with the needed
offset, becomes unreachable) the respective functionality patches the FineIBT
sequence with nops, making it a valid target that is still constrained by IBT.
- This series also fixes mismatched prototypes of certain indirect calls that
would trigger violations under any prototype-based CFI scheme.
- Also adds test modules for both IBT and FineIBT.
- Also adds coarsecf_check attributes to specific functions, making the compiler
emit them with regular ENDBRs instead of the FineIBT hash check. This is
useful because certain functions are called from assembly and we currently don't
have a sane scheme to set hashes in all of these (although we do it in one more
relevant spot).
- On violations, the hash check invokes __fineibt_handler, a function that
makes it easier to debug mismatched prototypes and the like. It can easily be
replaced by a ud2 instruction or similar.
- In my tests, the above-mentioned .config runs this series smoothly, without
any false-positive violations.

Some obvious possible improvements:
- The support should identify FineIBT sequences based on annotations, not on
parsing the actual instructions. This would make things less hacky and more
reliable.
- Assembly coverage must be improved eventually.
- The FineIBT hash check operation can have its length reduced by replacing the
inlined check with a call to a checker.

@PeterZ @JoshP

I'm a bit unaware of the details on why the objtool approach to bypass ENDBRs
was removed from the IBT series. Does this approach become sensible now that
it is a requirement for a new/enhanced feature? If not, how does extending the
linker to emit already-fixed offsets sound?

@Kees

I'm considering detaching the prototype fixes from this series and reworking
them to submit actual fixes (patches 10 and 11). Any specific suggestions for
these specific patches? Maybe you want to take a look and help in co-authorship
as we did with the void*-in-x86-crypto patches in the past? I guess these are
useful for whatever CFI scheme is in place.

@all

Any other major concerns, ideas, or suggestions? :)

Refs:

[1] - FineIBT:
https://www.openwall.com/lists/kernel-hardening/2021/02/11/1

[2] - FineIBT on Linux Security Summit:
https://static.sched.com/hosted_files/lssna2021/8f/LSS_FINEIBT_JOAOMOREIRA.pdf

[3] - KCFI Clang Patches:
https://reviews.llvm.org/D119296/new/

[4] - eXecute-Only Memory:
https://lpc.events/event/4/contributions/283/attachments/357/588/Touch_but_dont_look__Running_the_kernel_in_execute_only_memory-presented.pdf

[5] - IBT Patches v2:
https://lore.kernel.org/lkml/[email protected]/

[6] - FineIBT-capable Clang:
https://github.com/lvwr/llvm-project/tree/fineibt/kernel

[7] - Kernel .config and dummy stuff:
https://github.com/lvwr/kfineibt_testing

[8] - Linux + FineIBT:
https://github.com/lvwr/linux/tree/x86/fineibt

Joao Moreira (11):
x86: kernel FineIBT
kbuild: Support FineIBT build
objtool: Support FineIBT offset fixes
x86/module: Support FineIBT in modules
x86/text-patching: Support FineIBT text-patching
x86/bpf: Support FineIBT
x86/lib: Prevent UACCESS call warning from objtool
x86/ibt: Add CET_TEST module for IBT testing
x86/FineIBT: Add FINEIBT_TEST module
linux/interrupt: Fix prototype matching property
driver/int3400_thermal: Fix prototype matching

arch/x86/Kconfig | 10 +
arch/x86/Kconfig.debug | 10 +
arch/x86/Makefile | 3 +
arch/x86/entry/entry_64.S | 1 +
arch/x86/entry/vdso/Makefile | 5 +
arch/x86/include/asm/ibt.h | 16 ++
arch/x86/include/asm/setup.h | 12 +-
arch/x86/include/asm/special_insns.h | 4 +-
arch/x86/include/asm/text-patching.h | 92 ++++++++-
arch/x86/kernel/Makefile | 3 +
arch/x86/kernel/cet_test.c | 30 +++
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/fineibt.c | 123 ++++++++++++
arch/x86/kernel/fineibt_test.c | 39 ++++
arch/x86/kernel/head64.c | 12 +-
arch/x86/kernel/module.c | 45 ++++-
arch/x86/kernel/smpboot.c | 2 +-
arch/x86/lib/copy_mc.c | 2 +-
arch/x86/net/bpf_jit_comp.c | 31 +++
arch/x86/purgatory/Makefile | 4 +
.../intel/int340x_thermal/int3400_thermal.c | 10 +-
include/linux/interrupt.h | 6 +-
scripts/Makefile.build | 1 +
scripts/link-vmlinux.sh | 8 +
tools/objtool/arch/x86/decode.c | 175 +++++++++++++----
tools/objtool/arch/x86/include/arch/special.h | 2 +
tools/objtool/builtin-check.c | 3 +-
tools/objtool/check.c | 183 +++++++++++++++++-
tools/objtool/include/objtool/arch.h | 3 +
tools/objtool/include/objtool/builtin.h | 2 +-
30 files changed, 767 insertions(+), 72 deletions(-)
create mode 100644 arch/x86/kernel/cet_test.c
create mode 100644 arch/x86/kernel/fineibt.c
create mode 100644 arch/x86/kernel/fineibt_test.c

--
2.35.1


2022-04-20 14:07:19

by Joao Moreira

Subject: [RFC PATCH 05/11] x86/text-patching: Support FineIBT text-patching

From: Joao Moreira <[email protected]>

When patching a direct branch into text, consider that the target may have
a FineIBT hash check sequence and, if so, add the respective offset to the
branch target address. This is needed to support static calls.

Signed-off-by: Joao Moreira <[email protected]>
Tinkered-from-patches-by: Peter Zijlstra <[email protected]>
---
arch/x86/include/asm/text-patching.h | 92 +++++++++++++++++++++++++++-
1 file changed, 91 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index d20ab0921480..a450761fae62 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -4,6 +4,7 @@

#include <linux/types.h>
#include <linux/stddef.h>
+#include <linux/uaccess.h>
#include <asm/ptrace.h>

struct paravirt_patch_site;
@@ -66,6 +67,12 @@ extern void text_poke_finish(void);
#define JMP8_INSN_SIZE 2
#define JMP8_INSN_OPCODE 0xEB

+#define SUB_INSN_SIZE 7
+#define SUB_INSN_OPCODE 0x41
+
+#define JE_INSN_SIZE 2
+#define JE_INSN_OPCODE 0x74
+
#define DISP32_SIZE 4

static __always_inline int text_opcode_size(u8 opcode)
@@ -96,6 +103,83 @@ union text_poke_insn {
} __attribute__((packed));
};

+#ifdef CONFIG_X86_KERNEL_FINEIBT
+#define FINEIBT_FIXUP 18
+// AFTER_FINEIBT = FINEIBT_FIXUP - ENDBR_LEN - XOR_LEN - JMP_LEN
+#define AFTER_FINEIBT (FINEIBT_FIXUP - ENDBR_INSN_SIZE - 3 - 2)
+
+/// XXX: THIS IS *NOT* PROPERLY TESTED!
+/// I did not stumble on any scenario where this was needed while testing
+/// FineIBT, yet I'm keeping it here for concept/future reference. If we can't
+/// fix the displacement, then the branch will always stumble on the FineIBT
+/// hash check. To prevent that, patch the FineIBT hash check with nops.
+static __always_inline
+void bypass_fineibt_sequence(void *insn) {
+ static const char code[14] = { 0x4d, 0x31, 0xdb, 0xeb, AFTER_FINEIBT,
+ BYTES_NOP8, BYTES_NOP1 };
+ if (unlikely(system_state == SYSTEM_BOOTING)) {
+ text_poke_early(insn + 4, code, 14);
+ text_poke_early(insn + 11, code, 14);
+ } else {
+ text_poke_bp(insn + 4, code, 14, NULL);
+ text_poke_bp(insn + 11, code, 14, NULL);
+ }
+}
+
+// Identify if the target address is a FineIBT instruction sequence, which
+// should be:
+// endbr
+// sub $hash, %r11d
+// je 1f
+// call fineibt_handler (this will eventually be replaced with ud2)
+// 1:
+static __always_inline
+bool __is_fineibt_sequence(const void *addr) {
+ union text_poke_insn text;
+ u32 insn;
+
+ // the sequence starts with an endbr
+ if (get_kernel_nofault(insn, addr) || !(is_endbr(insn)))
+ return false;
+
+ // then followed by a sub
+ if (get_kernel_nofault(text, addr+4) || text.opcode != SUB_INSN_OPCODE)
+ return false;
+
+ // followed by a je
+ if (get_kernel_nofault(text, addr+11) || text.opcode != JE_INSN_OPCODE)
+ return false;
+
+ // and finished with a call (which eventually will be an ud2)
+ if (get_kernel_nofault(text, addr+13) ||
+ text.opcode != CALL_INSN_OPCODE)
+ return false;
+
+ return true;
+}
+
+// Verify if the branch target is a FineIBT sequence. If yes, fix the target
+// to point right after the sequence, preventing crashes.
+static __always_inline
+void *__text_fix_fineibt_branch_target(const void *addr, void *dest, int size) {
+ bool fineibt;
+ s32 disp;
+ fineibt = __is_fineibt_sequence(dest);
+ if (!fineibt)
+ return dest;
+
+ disp = (long) dest - (long) (addr + size) + FINEIBT_FIXUP;
+
+ // if fineibt-fixed displacement doesn't fit as an operand,
+ // remove fineibt hash check from target.
+ if (size == 2 && ((disp >> 31) != (disp >> 7))) {
+ bypass_fineibt_sequence(dest);
+ return dest;
+ }
+ return dest + FINEIBT_FIXUP;
+}
+#endif
+
static __always_inline
void __text_gen_insn(void *buf, u8 opcode, const void *addr, const void *dest, int size)
{
@@ -115,7 +199,13 @@ void __text_gen_insn(void *buf, u8 opcode, const void *addr, const void *dest, i
insn->opcode = opcode;

if (size > 1) {
- insn->disp = (long)dest - (long)(addr + size);
+#ifdef CONFIG_X86_KERNEL_FINEIBT
+ void *fineibt_dest = __text_fix_fineibt_branch_target(addr,
+ (void *) dest, size);
+ insn->disp = (long) fineibt_dest - (long) (addr + size);
+#else
+ insn->disp = (long) dest - (long) (addr + size);
+#endif
if (size == 2) {
/*
* Ensure that for JMP8 the displacement
--
2.35.1

2022-04-20 18:49:34

by Joao Moreira

Subject: [RFC PATCH 07/11] x86/lib: Prevent UACCESS call warning from objtool

From: Joao Moreira <[email protected]>

Objtool emits a warning whenever it finds a call that may happen with
UACCESS enabled. Prevent this by not emitting calls to the
__fineibt_handler in such circumstances, making the function
coarse-grained.

Signed-off-by: Joao Moreira <[email protected]>
---
arch/x86/lib/copy_mc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/lib/copy_mc.c b/arch/x86/lib/copy_mc.c
index 80efd45a7761..554e6c6ecea2 100644
--- a/arch/x86/lib/copy_mc.c
+++ b/arch/x86/lib/copy_mc.c
@@ -22,7 +22,7 @@ void enable_copy_mc_fragile(void)
* Similar to copy_user_handle_tail, probe for the write fault point, or
* source exception point.
*/
-__visible notrace unsigned long
+__visible notrace unsigned long __coarseendbr
copy_mc_fragile_handle_tail(char *to, char *from, unsigned len)
{
for (; len; --len, to++, from++)
--
2.35.1

2022-04-20 22:31:22

by Joao Moreira

Subject: [RFC PATCH 04/11] x86/module: Support FineIBT in modules

From: Joao Moreira <[email protected]>

Identify direct branch relocations targeting a FineIBT hash check
sequence and fix the offset to bypass it.

Invoke objtool with the fineibt flag for modules to fix in-module
direct branches too.

Signed-off-by: Joao Moreira <[email protected]>
Tinkered-from-patches-by: Peter Zijlstra <[email protected]>
---
arch/x86/kernel/module.c | 45 +++++++++++++++++++++++++++++++++++++---
scripts/Makefile.build | 1 +
2 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index b98ffcf4d250..4afe71ae3e56 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -128,6 +128,41 @@ int apply_relocate(Elf32_Shdr *sechdrs,
return 0;
}
#else /*X86_64*/
+
+// shamelessly reshaped from PeterZ's IBT patches v2
+static inline void fineibt_branch_fix(void *loc, u64 *val)
+{
+#ifdef CONFIG_X86_KERNEL_FINEIBT
+ const void *addr = (void *)(4 + *val);
+ union text_poke_insn text;
+ u32 insn;
+
+ if (get_kernel_nofault(insn, addr) || !is_endbr(insn))
+ return;
+
+ if (get_kernel_nofault(text, addr+4) || text.opcode != SUB_INSN_OPCODE)
+ return;
+
+ if (get_kernel_nofault(text, addr+11) || text.opcode != JE_INSN_OPCODE)
+ return;
+
+ if (get_kernel_nofault(text, addr+13) ||
+ text.opcode != CALL_INSN_OPCODE)
+ return;
+
+ /* validate jmp.d32/call @ loc */
+ if (WARN_ONCE(get_kernel_nofault(text, loc-1) ||
+ (text.opcode != CALL_INSN_OPCODE &&
+ text.opcode != JMP32_INSN_OPCODE),
+ "Unexpected code at: %pS\n", loc))
+ return;
+
+ DEBUGP("FineIBT fixed direct branch: %pS\n", addr);
+
+ *val += 18;
+#endif
+}
+
static int __apply_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
@@ -139,6 +174,7 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs,
Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr;
Elf64_Sym *sym;
void *loc;
+ int type;
u64 val;

DEBUGP("Applying relocate section %u to %u\n",
@@ -153,13 +189,14 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs,
sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
+ ELF64_R_SYM(rel[i].r_info);

+ type = ELF64_R_TYPE(rel[i].r_info);
+
DEBUGP("type %d st_value %Lx r_addend %Lx loc %Lx\n",
- (int)ELF64_R_TYPE(rel[i].r_info),
- sym->st_value, rel[i].r_addend, (u64)loc);
+ type, sym->st_value, rel[i].r_addend, (u64)loc);

val = sym->st_value + rel[i].r_addend;

- switch (ELF64_R_TYPE(rel[i].r_info)) {
+ switch (type) {
case R_X86_64_NONE:
break;
case R_X86_64_64:
@@ -185,6 +222,8 @@ static int __apply_relocate_add(Elf64_Shdr *sechdrs,
case R_X86_64_PLT32:
if (*(u32 *)loc != 0)
goto invalid_relocation;
+ if (type == R_X86_64_PLT32)
+ fineibt_branch_fix(loc, &val);
val -= (u64)loc;
write(loc, &val, 4);
#if 0
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 9717e6f6fb31..d8862673b416 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -230,6 +230,7 @@ objtool_args = \
$(if $(CONFIG_UNWINDER_ORC),orc generate,check) \
$(if $(part-of-module), --module) \
$(if $(CONFIG_X86_KERNEL_IBT), --lto --ibt) \
+ $(if $(CONFIG_X86_KERNEL_FINEIBT), --fineibt) \
$(if $(CONFIG_FRAME_POINTER),, --no-fp) \
$(if $(CONFIG_GCOV_KERNEL)$(CONFIG_LTO_CLANG), --no-unreachable)\
$(if $(CONFIG_RETPOLINE), --retpoline) \
--
2.35.1

2022-04-21 05:03:16

by Joao Moreira

Subject: [RFC PATCH 10/11] linux/interrupt: Fix prototype matching property

From: Joao Moreira <[email protected]>

Unions will make function pointers with different prototypes be used through
the same call. This leads into the same call instruction being used for
calling functions with different prototypes, making them unsuitable for
prototype-based fine-grained CFI.

Fix this CFI policy violation by removing the function pointer union in
the tasklet struct.

Signed-off-by: Joao Moreira <[email protected]>
---
include/linux/interrupt.h | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index f40754caaefa..8d5504b0f20b 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -650,10 +650,8 @@ struct tasklet_struct
unsigned long state;
atomic_t count;
bool use_callback;
- union {
- void (*func)(unsigned long data);
- void (*callback)(struct tasklet_struct *t);
- };
+ void (*func)(unsigned long data);
+ void (*callback)(struct tasklet_struct *t);
unsigned long data;
};

--
2.35.1

2022-04-21 05:14:42

by Kees Cook

Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

Hi!

On Tue, Apr 19, 2022 at 05:42:30PM -0700, [email protected] wrote:
> I'm considering detaching the prototype fixes from this series and reworking
> them to submit actual fixes (patches 10 and 11). Any specific suggestions for
> these specific patches? Maybe you want to take a look and help in co-authorship
> as we did with the void*-in-x86-crypto patches in the past? I guess these are
> useful for whatever CFI scheme is in place.

Yeah, if 10 and 11 are general prototype-based fixes, let's get them in.
I would expect regular CFI and kCFI to trip over those too. I'll comment
on those patches directly.

> Any other major concerns, ideas, or suggestions? :)

I think it'd be good to get kCFI landed in Clang first (since it is
effectively architecture agnostic), and then get FineIBT landed. But
that doesn't mean we can't be working on the kernel side of things at
the same time.

And just thinking generally, for other architecture-specific stuff,
I do wonder what an arm64 PAC-based CFI might look like. I prefer things
be hard-coded as kCFI is doing, but it'd be nice to be able to directly
measure performance and size overheads comparing the various methods.

--
Kees Cook

2022-04-21 10:26:25

by Joao Moreira

Subject: [RFC PATCH 08/11] x86/ibt: Add CET_TEST module for IBT testing

From: Joao Moreira <[email protected]>

Add a kernel module that violates the IBT policy on load, triggering a
control-protection fault and making the enforcement visible.

Signed-off-by: Joao Moreira <[email protected]>
Tinkered-from-work-by: Alyssa Milburn <[email protected]>
---
arch/x86/Kconfig.debug | 5 +++++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/cet_test.c | 30 ++++++++++++++++++++++++++++++
3 files changed, 36 insertions(+)
create mode 100644 arch/x86/kernel/cet_test.c

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index d3a6f74a94bd..d2463dd912c1 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -279,3 +279,8 @@ endchoice
config FRAME_POINTER
depends on !UNWINDER_ORC && !UNWINDER_GUESS
bool
+
+config X86_CET_TEST
+ tristate "in-kernel CET testing module"
+ depends on m
+ depends on X86_KERNEL_IBT
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index cb947569e9d8..a82bcd14bd40 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -149,6 +149,7 @@ obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o

obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o
obj-$(CONFIG_X86_KERNEL_FINEIBT) += fineibt.o
+obj-$(CONFIG_X86_CET_TEST) += cet_test.o

###
# 64 bit specific files
diff --git a/arch/x86/kernel/cet_test.c b/arch/x86/kernel/cet_test.c
new file mode 100644
index 000000000000..c48be8cbd0b5
--- /dev/null
+++ b/arch/x86/kernel/cet_test.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/module.h>
+
+static int cet_test_init(void)
+{
+ pr_info("CET test, expect faults\n");
+
+ // FIXME: use register_die_notifier
+
+ asm volatile(
+ "lea 1f(%%rip), %%rax\n"
+ "jmp *%%rax\n"
+ "nop\n"
+ "1:\n"
+ /* no endbranch */
+ "nop\n"
+ :::"rax"
+ );
+ return 0;
+}
+
+static void cet_test_exit(void)
+{
+}
+
+module_init(cet_test_init);
+module_exit(cet_test_exit);
+
+MODULE_LICENSE("GPL v2");
--
2.35.1

2022-04-21 13:31:52

by Kees Cook

Subject: Re: [RFC PATCH 10/11] linux/interrupt: Fix prototype matching property

On Tue, Apr 19, 2022 at 05:42:40PM -0700, [email protected] wrote:
> Unions will make function pointers with different prototypes be used through
> the same call. This leads into the same call instruction being used for
> calling functions with different prototypes, making them unsuitable for
> prototype-based fine-grained CFI.

Why? Shouldn't the callers be using different prototypes?

> Fix this CFI policy violation by removing the function pointer union in
> the tasklet struct.

The good news is that tasklet is on the way out the door[1], so this may
quickly become a non-issue, but also to that end, this fix is hardly a
problem for a deprecated API...

-Kees

[1] https://lore.kernel.org/linux-hardening/[email protected]/

>
> Signed-off-by: Joao Moreira <[email protected]>
> ---
> include/linux/interrupt.h | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index f40754caaefa..8d5504b0f20b 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -650,10 +650,8 @@ struct tasklet_struct
> unsigned long state;
> atomic_t count;
> bool use_callback;
> - union {
> - void (*func)(unsigned long data);
> - void (*callback)(struct tasklet_struct *t);
> - };
> + void (*func)(unsigned long data);
> + void (*callback)(struct tasklet_struct *t);
> unsigned long data;
> };
>
> --
> 2.35.1
>

--
Kees Cook

2022-04-21 13:56:09

by Edgecombe, Rick P

Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On Tue, 2022-04-19 at 17:42 -0700, [email protected] wrote:
> A debatable point is the fact that on FineIBT the checks are made on
> the callee
> side. On a quick look, this seems to be cool because it allows strict
> reachability refinement of more security-critical functions (like
> hardware
> feature disabling ones) while still allowing other less critical
> functions to be
> relaxed/coarse-grained; under caller-side checks, if one single
> function is
> required to be relaxed, this leads into an indirect call instruction
> being
> relaxed, then becoming a branch capable of reaching all the functions
> in the
> executable address space, including those considered security-
> critical. Inputs
> and opinions on this are very welcome, as there are other
> perspectives about
> this I might be missing.

One minor point: In the coarse IBT implementation there are places in
the kernel where IBT is toggled because of missing endbranches (calling
into BIOS). So for caller checked solutions, these calls would probably
need to be annotated or something such that caller checks were not
generated.

I haven't been following kCFI, so apologies if this is already handled
somehow.

2022-04-21 16:27:32

by Joao Moreira

Subject: [RFC PATCH 01/11] x86: kernel FineIBT

From: Joao Moreira <[email protected]>

Make kernel code compatible with FineIBT compilation.
- Set FineIBT defines.
- Mark functions reached from assembly as coarse-grained.
- Add FineIBT handler function, which is invoked on policy violations.

Signed-off-by: Joao Moreira <[email protected]>
---
arch/x86/entry/entry_64.S | 1 +
arch/x86/include/asm/ibt.h | 16 ++++
arch/x86/include/asm/setup.h | 12 +--
arch/x86/include/asm/special_insns.h | 4 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/fineibt.c | 123 +++++++++++++++++++++++++++
arch/x86/kernel/head64.c | 12 +--
arch/x86/kernel/smpboot.c | 2 +-
8 files changed, 156 insertions(+), 16 deletions(-)
create mode 100644 arch/x86/kernel/fineibt.c

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 4faac48ebec5..901b702fb16d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -295,6 +295,7 @@ SYM_CODE_START(ret_from_fork)
/* kernel thread */
UNWIND_HINT_EMPTY
movq %r12, %rdi
+ FINEIBT_HASH(0x89f22991)
CALL_NOSPEC rbx
/*
* A kernel thread is allowed to return here after successfully
diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h
index 689880eca9ba..580aa8b83bb2 100644
--- a/arch/x86/include/asm/ibt.h
+++ b/arch/x86/include/asm/ibt.h
@@ -21,6 +21,16 @@

#define HAS_KERNEL_IBT 1

+#if defined(CONFIG_X86_KERNEL_FINEIBT) && !defined(BUILD_VDSO) && !defined(BUILD_VDSO32_64)
+#define HAS_KERNEL_FINEIBT 1
+#define FINEIBT_HASH(hash) mov $hash, %r11d
+#define __coarseendbr __attribute__((coarsecf_check))
+#else
+#define HAS_KERNEL_FINEIBT 0
+#define FINEIBT_HASH(hash)
+#define __coarseendbr
+#endif
+
#ifndef __ASSEMBLY__

#ifdef CONFIG_X86_64
@@ -29,6 +39,7 @@
#define ASM_ENDBR "endbr32\n\t"
#endif

+#undef __noendbr
#define __noendbr __attribute__((nocf_check))

static inline __attribute_const__ u32 gen_endbr(void)
@@ -80,12 +91,17 @@ extern __noendbr void ibt_restore(u64 save);
#else /* !IBT */

#define HAS_KERNEL_IBT 0
+#define HAS_KERNEL_FINEIBT 0
+#define FINEIBT_HASH(hash)

#ifndef __ASSEMBLY__

#define ASM_ENDBR

+#undef __noendbr
#define __noendbr
+#undef __coarseendbr
+#define __coarseendbr

static inline bool is_endbr(u32 val) { return false; }

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 896e48d45828..d2a2f6456403 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -49,10 +49,10 @@ extern unsigned long saved_video_mode;

extern void reserve_standard_io_resources(void);
extern void i386_reserve_resources(void);
-extern unsigned long __startup_64(unsigned long physaddr, struct boot_params *bp);
-extern unsigned long __startup_secondary_64(void);
-extern void startup_64_setup_env(unsigned long physbase);
-extern void early_setup_idt(void);
+extern unsigned long __coarseendbr __startup_64(unsigned long physaddr, struct boot_params *bp);
+extern unsigned long __coarseendbr __startup_secondary_64(void);
+extern void __coarseendbr startup_64_setup_env(unsigned long physbase);
+extern void __coarseendbr early_setup_idt(void);
extern void __init do_early_exception(struct pt_regs *regs, int trapnr);

#ifdef CONFIG_X86_INTEL_MID
@@ -137,8 +137,8 @@ extern void probe_roms(void);
asmlinkage void __init i386_start_kernel(void);

#else
-asmlinkage void __init x86_64_start_kernel(char *real_mode);
-asmlinkage void __init x86_64_start_reservations(char *real_mode_data);
+asmlinkage void __init __coarseendbr x86_64_start_kernel(char *real_mode);
+asmlinkage void __init __coarseendbr x86_64_start_reservations(char *real_mode_data);

#endif /* __i386__ */
#endif /* _SETUP */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 68c257a3de0d..7f32ef8d23f0 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -49,7 +49,7 @@ static inline unsigned long __native_read_cr3(void)
return val;
}

-static inline void native_write_cr3(unsigned long val)
+static inline void __coarseendbr native_write_cr3(unsigned long val)
{
asm volatile("mov %0,%%cr3": : "r" (val) : "memory");
}
@@ -74,7 +74,7 @@ static inline unsigned long native_read_cr4(void)
return val;
}

-void native_write_cr4(unsigned long val);
+void __coarseendbr native_write_cr4(unsigned long val);

#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
static inline u32 rdpkru(void)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ed4417500700..70e94194ee2b 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -463,7 +463,7 @@ void native_write_cr0(unsigned long val)
}
EXPORT_SYMBOL(native_write_cr0);

-void __no_profile native_write_cr4(unsigned long val)
+void __no_profile __coarseendbr native_write_cr4(unsigned long val)
{
unsigned long bits_changed = 0;

diff --git a/arch/x86/kernel/fineibt.c b/arch/x86/kernel/fineibt.c
new file mode 100644
index 000000000000..685e4308d86e
--- /dev/null
+++ b/arch/x86/kernel/fineibt.c
@@ -0,0 +1,123 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * This file contains the FineIBT handler function.
+ */
+
+#include <linux/export.h>
+#include <linux/printk.h>
+#include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <asm/ibt.h>
+
+void __noendbr __fineibt_handler(void);
+
+void __fineibt_debug(void) {
+ asm volatile ("nop\n");
+ printk("fineibt debug\n");
+}
+EXPORT_SYMBOL(__fineibt_debug);
+
+#define FINEIBT_VADDR_LEN 4096
+#define DO_ALL_PUSHS \
+ asm("nop\n\t" \
+ "push %rsi\n\t" \
+ "push %rdi\n\t" \
+ "push %rdx\n\t" \
+ "push %rcx\n\t" \
+ "push %rbx\n\t" \
+ "push %rax\n\t" \
+ "push %r8\n\t" \
+ "push %r9\n\t" \
+ "push %r10\n\t" \
+ "push %r11\n\t" \
+ "push %r12\n\t" \
+ "push %r13\n\t" \
+ "push %r14\n\t" \
+ "push %r15\n\t")
+
+#define DO_ALL_POPS \
+ asm("nop\n\t" \
+ "pop %r15\n\t" \
+ "pop %r14\n\t" \
+ "pop %r13\n\t" \
+ "pop %r12\n\t" \
+ "pop %r11\n\t" \
+ "pop %r10\n\t" \
+ "pop %r9\n\t" \
+ "pop %r8\n\t" \
+ "pop %rax\n\t" \
+ "pop %rbx\n\t" \
+ "pop %rcx\n\t" \
+ "pop %rdx\n\t" \
+ "pop %rdi\n\t" \
+ "pop %rsi\n\t")
+
+struct fineibt_violation {
+ void * vaddr;
+ void * caddr;
+ bool printed;
+};
+
+typedef struct fineibt_violation fineibt_violation;
+
+static fineibt_violation vlts[FINEIBT_VADDR_LEN];
+static unsigned long vlts_next = 0;
+static bool vlts_initialize = true;
+static DEFINE_SPINLOCK(fineibt_lock);
+
+void __noendbr __fineibt_handler(void) {
+ unsigned i;
+ unsigned long flags;
+ bool skip;
+ void * ret;
+ void * caller;
+
+ DO_ALL_PUSHS;
+
+ spin_lock_irqsave(&fineibt_lock, flags);
+ skip = false;
+
+ asm("\t movq 0x90(%%rsp),%0" : "=r"(ret));
+ asm("\t movq 0x98(%%rsp),%0" : "=r"(caller));
+
+ if(vlts_initialize){
+ for(i = 0; i < FINEIBT_VADDR_LEN; i++) {
+ vlts[i].vaddr = 0;
+ vlts[i].caddr = 0;
+ vlts[i].printed = 0;
+ }
+ vlts_initialize = false;
+ }
+
+ if(vlts_next >= FINEIBT_VADDR_LEN) {
+ if(vlts_next == FINEIBT_VADDR_LEN) {
+ printk("FineIBT reached max buffer\n");
+ vlts_next++;
+ }
+ skip = true;
+ }
+
+ for(i = 0; i < vlts_next; i++){
+ if(vlts[i].vaddr == ret && vlts[i].caddr == caller) {
+ skip = true;
+ break;
+ }
+ }
+
+ if(!skip) {
+ vlts[vlts_next].vaddr = ret;
+ vlts[vlts_next].caddr = caller;
+ vlts[vlts_next].printed = 0;
+ vlts_next = vlts_next + 1;
+ }
+
+ spin_unlock_irqrestore(&fineibt_lock, flags);
+
+ if(!skip) {
+ printk("FineIBT violation: %px:%px:%u\n", ret, caller,
+ vlts_next);
+ }
+ DO_ALL_POPS;
+}
+
+EXPORT_SYMBOL(__fineibt_handler);
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 4f5ecbbaae77..f773d771e07d 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -162,7 +162,7 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, pmdv
* boot-time crashes. To work around this problem, every global pointer must
* be adjusted using fixup_pointer().
*/
-unsigned long __head __startup_64(unsigned long physaddr,
+unsigned long __head __coarseendbr __startup_64(unsigned long physaddr,
struct boot_params *bp)
{
unsigned long load_delta, *p;
@@ -308,7 +308,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
return sme_postprocess_startup(bp, pmd);
}

-unsigned long __startup_secondary_64(void)
+unsigned long __coarseendbr __startup_secondary_64(void)
{
/*
* Return the SME encryption mask (if SME is active) to be used as a
@@ -464,8 +464,8 @@ static void __init copy_bootdata(char *real_mode_data)
sme_unmap_bootdata(real_mode_data);
}

-asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
-{
+asmlinkage __visible void __init __coarseendbr
+x86_64_start_kernel(char * real_mode_data) {
/*
* Build-time sanity checks on the kernel image and module
* area mappings. (these are purely build-time and produce no code)
@@ -597,7 +597,7 @@ static void startup_64_load_idt(unsigned long physbase)
}

/* This is used when running on kernel addresses */
-void early_setup_idt(void)
+void __coarseendbr early_setup_idt(void)
{
/* VMM Communication Exception */
if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
@@ -610,7 +610,7 @@ void early_setup_idt(void)
/*
* Setup boot CPU state needed before kernel switches to virtual addresses.
*/
-void __head startup_64_setup_env(unsigned long physbase)
+void __head __coarseendbr startup_64_setup_env(unsigned long physbase)
{
/* Load GDT */
startup_gdt_descr.address = (unsigned long)fixup_pointer(startup_gdt, physbase);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 2ef14772dc04..5fa17d5716bb 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -214,7 +214,7 @@ static int enable_start_cpu0;
/*
* Activate a secondary processor.
*/
-static void notrace start_secondary(void *unused)
+static void notrace __coarseendbr start_secondary(void *unused)
{
/*
* Don't put *anything* except direct CPU state initialization
--
2.35.1

2022-04-22 06:39:01

by H.J. Lu

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On Thu, Apr 21, 2022 at 8:23 AM Joao Moreira <[email protected]> wrote:
>
> On 2022-04-21 00:49, Peter Zijlstra wrote:
> > On Wed, Apr 20, 2022 at 03:40:41PM -0700, Joao Moreira wrote:
> >> > >
> >> > > If FineIBT needs it, I could reconsider. But I think there's a strong
> >> > > case to be made that the linker should be doing that instead.
> >> >
> >> > That sounds reasonable to me (and reminds me of linker relaxation).
> >> > Joao, can you please work with Fangrui (LLD) and HJ (GNU binutils) to
> >> > determine how feasible this would be? I assume code outside the kernel
> >> > might enjoy such an optimization, too. When that's the case, then it
> >> > probably makes more sense to "upstream" such "optimizations" from the
> >> > kernel-specific objtool into the toolchains.
> >>
> >> Alright, these are the greenlights I was hoping for.
> >>
> >> I went quickly into this with HJ and he mentioned that it should be
> >> doable
> >> in the linker, and that he has a patch for it in gcc (for local
> >> function,
> >> from what I could see):
> >> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html
> >>
> >> If @Fangrui is fine with it, I would like to try implementing this
> >> myself in
> >> lld (I'm still learning a lot about lld and having an actual problem
> >> to
> >> solve is the kind of fuel I need). Should take me a while, but I think
> >> this
> >> is not urgent, right? I can also go ahead and replicate HJ's gcc patch
> >> into
> >> clang, so we can also handle the local functions within the compiler
> >> (I
> >> think this makes a lot of sense).
> >>
> >> Once we have these in, I'll revisit FineIBT and extend the features to
> >> handle the FineIBT instrumentation. Hopefully we'll be released from
> >> needing
> >> objtool (famous last words?!).
> >>
> >> This sounds like a plan, but I'm ofc open to suggestions or different
> >> ideas/plans.
> >
> > So trivially the plan sounds like: compiler fixes STB_LOCAL because it
> > has the scope, and the linker fixes everything else. However, that
> > seems
> > to assume that !STB_LOCAL will have ENDBR.
> >
> > This latter isn't true; for one there's __attribute__((nocf_check))
> > that
> > can be used to suppress ENDBR generation on a function.
> >
> > Alternatively the linker will need to 'read' the function to determine
> > if it has ENDBR, or we need to augment the ELF format such that we can
> > tell from that.
> >
> > So what exactly is the plan?
>
> I ran into too many broken dreams by trying to infer the presence of
> ENDBRs just by the symbol locality/linkage... not only because of the
> attribute, but also because of ancient assembly.
>
> So, my first thought was to use something similar to the
> __patchable_function_entries section
> (https://man7.org/linux/man-pages/man1/gcc.1.html), where we would have
> a section to mark all the placed ENDBR. But then it occurred to me that
> if we follow that road we'll miss the ENDBR placed in assembly unless we
> mark it manually, so I started thinking that reading the target
> instructions from the ELF object could be a more simplified approach,
> although a little more treacherous.
>
> I didn't decide yet what to try first -- any thoughts?
>
> @Fangrui's and @HJ's thoughts about this could be gold.

You can't assume ENDBR existence just by symbol visibility.
The compiler knows if there is an ENDBR at a function entry since
it generated it. Otherwise, you need to check
the first 4 bytes at the function entry.

--
H.J.
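
A minimal user-space sketch of the byte check H.J. describes above (the
endbr64 opcode is f3 0f 1e fa; the helper name here is made up for
illustration, not part of any patch in this series):

```c
#include <stdbool.h>
#include <string.h>

/* endbr64 encodes as f3 0f 1e fa (endbr32 would be f3 0f 1e fb). */
static bool has_endbr64(const unsigned char *entry)
{
	static const unsigned char endbr64[4] = { 0xf3, 0x0f, 0x1e, 0xfa };

	return memcmp(entry, endbr64, sizeof(endbr64)) == 0;
}
```

A linker (or an objtool-like tool) would run such a check against the first
bytes of each relocation target before deciding whether a direct branch can
safely be retargeted past the landing pad.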

2022-04-22 08:12:49

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On Wed, Apr 20, 2022 at 09:40:44AM +0200, Peter Zijlstra wrote:
> On Tue, Apr 19, 2022 at 05:42:30PM -0700, [email protected] wrote:
> > @PeterZ @JoshP
> >
> > I'm a bit unaware of the details on why the objtool approach to bypass ENDBRs
> > was removed from the IBT series. Is this approach now sensible considering that
> > it is a requirement for a new/enhanced feature? If not, how extending the Linker
> > to emit already fixed offsets sounds like?
>
> Josh hates objtool modifying actual code. He much prefers that objtool
> only emit out-of-band data.
>
> Now, I did sneak in that jump_label nop'ing, and necessity (broken
> compilers) had us do the KCOV nop'ing in noinstr, but if you look at the
> recent objtool series here:
>
> https://lkml.kernel.org/r/[email protected]
>
> you'll see his thoughts on that :-)
>
> Now, I obviously don't mind, it's easy enough to figure out what objtool
> actually does with something like:
>
> $ OBJTOOL_ARGS="--backup" make O=ibt-build/ -j$lots vmlinux
> $ objdiff.sh ibt-build/vmlinux.o
>
> Where objdiff.sh is the below crummy script.
>
> Now, one compromise that I did get out of Josh was that he objected less
> to rewriting relocations than to rewriting the immediates. From my
> testing the relocations got us the vast majority of direct call sites,
> very few are immediates.
>
> Josh, any way you might reconsider all that? :-)

If I remember correctly, the goal of --ibt-fix-direct was to avoid
hitting unnecessary ENDBRs, which basically decode to NOPs, so the
ghastly hack wasn't worth it.

If FineIBT needs it, I could reconsider. But I think there's a strong
case to be made that the linker should be doing that instead.

--
Josh

2022-04-22 08:21:28

by Joao Moreira

[permalink] [raw]
Subject: [RFC PATCH 09/11] x86/FineIBT: Add FINEIBT_TEST module

From: Joao Moreira <[email protected]>

Add a module that, on load, calls a function directly, ensuring that
FineIBT fixes for module relocations work as expected. Next, the module
invokes another function indirectly with a wrong hash in R11, causing a
violation to be triggered (and the __fineibt_handler to be invoked).

Signed-off-by: Joao Moreira <[email protected]>
---
arch/x86/Kconfig.debug | 5 +++++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/fineibt_test.c | 39 ++++++++++++++++++++++++++++++++++
3 files changed, 45 insertions(+)
create mode 100644 arch/x86/kernel/fineibt_test.c

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index d2463dd912c1..4a5617c2470d 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -284,3 +284,8 @@ config X86_CET_TEST
depends on m
depends on X86_KERNEL_IBT
tristate "in-kernel CET testing module"
+
+config X86_FINEIBT_TEST
+ depends on m
+ depends on X86_KERNEL_FINEIBT
+ tristate "in-kernel FineIBT testing module"
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index a82bcd14bd40..5d7f39f3d909 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -150,6 +150,7 @@ obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o
obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o
obj-$(CONFIG_X86_KERNEL_FINEIBT) += fineibt.o
obj-$(CONFIG_X86_CET_TEST) += cet_test.o
+obj-$(CONFIG_X86_FINEIBT_TEST) += fineibt_test.o

###
# 64 bit specific files
diff --git a/arch/x86/kernel/fineibt_test.c b/arch/x86/kernel/fineibt_test.c
new file mode 100644
index 000000000000..c8cbff6208f8
--- /dev/null
+++ b/arch/x86/kernel/fineibt_test.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/module.h>
+
+void __fineibt_debug(void);
+
+void fineibt_foo(void) {
+ pr_info("FineIBT: dmesg should show a FineIBT violation message.\n");
+}
+
+void fineibt_bar(void) {
+ pr_info("FineIBT: this first one should run smoothly.\n");
+}
+
+static int fineibt_test_init(void)
+{
+ pr_info("FineIBT test\n");
+
+ __fineibt_debug();
+
+ asm volatile(
+ "call fineibt_bar\n"
+ "lea fineibt_foo(%%rip), %%rax\n"
+ "mov $0xdeadbeef, %%r11\n"
+ "call *%%rax\n"
+ /* this should trigger the handler because the hash is wrong */
+		::: "rax", "r11"
+ );
+ return 0;
+}
+
+static void fineibt_test_exit(void)
+{
+}
+
+module_init(fineibt_test_init);
+module_exit(fineibt_test_exit);
+
+MODULE_LICENSE("GPL v2");
--
2.35.1

2022-04-22 14:11:44

by H.J. Lu

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On Thu, Apr 21, 2022 at 3:11 PM Fangrui Song <[email protected]> wrote:
>
> On 2022-04-21, H.J. Lu wrote:
> >On Thu, Apr 21, 2022 at 8:23 AM Joao Moreira <[email protected]> wrote:
> >>
> >> On 2022-04-21 00:49, Peter Zijlstra wrote:
> >> > On Wed, Apr 20, 2022 at 03:40:41PM -0700, Joao Moreira wrote:
> >> >> > >
> >> >> > > If FineIBT needs it, I could reconsider. But I think there's a strong
> >> >> > > case to be made that the linker should be doing that instead.
> >> >> >
> >> >> > That sounds reasonable to me (and reminds me of linker relaxation).
> >> >> > Joao, can you please work with Fangrui (LLD) and HJ (GNU binutils) to
> >> >> > determine how feasible this would be? I assume code outside the kernel
> >> >> > might enjoy such an optimization, too. When that's the case, then it
> >> >> > probably makes more sense to "upstream" such "optimizations" from the
> >> >> > kernel-specific objtool into the toolchains.
> >> >>
> >> >> Alright, these are the greenlights I was hoping for.
> >> >>
> >> >> I went quickly into this with HJ and he mentioned that it should be
> >> >> doable
> >> >> in the linker, and that he has a patch for it in gcc (for local
> >> >> function,
> >> >> from what I could see):
> >> >> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html
> >> >>
> >> >> If @Fangrui is fine with it, I would like to try implementing this
> >> >> myself in
> >> >> lld (I'm still learning a lot about lld and having an actual problem
> >> >> to
> >> >> solve is the kind of fuel I need). Should take me a while, but I think
> >> >> this
> >> >> is not urgent, right? I can also go ahead and replicate HJ's gcc patch
> >> >> into
> >> >> clang, so we can also handle the local functions within the compiler
> >> >> (I
> >> >> think this makes a lot of sense).
> >> >>
> >> >> Once we have these in, I'll revisit FineIBT and extend the features to
> >> >> handle the FineIBT instrumentation. Hopefully we'll be released from
> >> >> needing
> >> >> objtool (famous last words?!).
> >> >>
> >> >> This sounds like a plan, but I'm ofc open to suggestions or different
> >> >> ideas/plans.
>
> Thanks for looping me in! (I mean: this is being discussed beforehand,
> instead of GCC/GNU ld taking some steps and pushing the LLVM toolchain
> to follow suit.)
> Though I think I may not have the full context here...
>
> I can see that you have a request to skip the endbr instruction
> (and potentially more instructions for FineIBT).
>
> The GCC patch https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html
> achieves it. A linker implementation will handle more cases.
> This is somewhat similar to PowerPC64's global entry vs local entry.
> The linker can redirect a branch instruction to the local entry in some
> conditions. My concern is that inspecting the section content goes too far
> and breaks the spirit of ELF relocation resolving. The proper way is to use
> a symbol attribute (e.g. st_other).
>
> st_other bits are scarce, so any use needs to be prudent.

On the other hand, linker inspection doesn't require changes in object files.
It is much more user-friendly. The x86-64 psABI already allows linker
optimizations based on the contents at relocation sites. This goes one step
further, to the contents at relocation targets.

> >> > So trivially the plan sounds like: compiler fixes STB_LOCAL because it
> >> > has the scope, and the linker fixes everything else. However, that
> >> > seems
> >> > to assume that !STB_LOCAL will have ENDBR.
> >> >
> >> > This latter isn't true; for one there's __attribute__((nocf_check))
> >> > that
> >> > can be used to suppress ENDBR generation on a function.
> >> >
> >> > Alternatively the linker will need to 'read' the function to determine
> >> > if it has ENDBR, or we need to augment the ELF format such that we can
> >> > tell from that.
> >> >
> >> > So what exactly is the plan?
> >>
> >> I ran into too many broken dreams by trying to infer the presence of
> >> ENDBRs just by the symbol locality/linkage... not only because of the
> >> attribute, but also because of ancient assembly.
> >>
> >> So, my first thought was to use something similar to the
> >> __patchable_function_entries section
> >> (https://man7.org/linux/man-pages/man1/gcc.1.html), where we would have
> >> a section to mark all the placed ENDBR. But then it occurred to me that
> >> if we follow that road we'll miss the ENDBR placed in assembly unless we
> >> mark it manually, so I started thinking that reading the target
> >> instructions from the ELF object could be a more simplified approach,
> >> although a little more treacherous.
> >>
> >> I didn't decide yet what to try first -- any thoughts?
> >>
> >> @Fangrui's and @HJ's thoughts about this could be gold.
> >
> >You can't assume ENDBR existence just by symbol visibility.
> >The compiler knows if there is an ENDBR at a function entry since
> >it generated it. Otherwise, you need to check
> >the first 4 bytes at the function entry.
> >
> >--
> >H.J.



--
H.J.

2022-04-22 17:26:10

by Joao Moreira

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

>>
>> If FineIBT needs it, I could reconsider. But I think there's a strong
>> case to be made that the linker should be doing that instead.
>
> That sounds reasonable to me (and reminds me of linker relaxation).
> Joao, can you please work with Fangrui (LLD) and HJ (GNU binutils) to
> determine how feasible this would be? I assume code outside the kernel
> might enjoy such an optimization, too. When that's the case, then it
> probably makes more sense to "upstream" such "optimizations" from the
> kernel-specific objtool into the toolchains.

Alright, these are the greenlights I was hoping for.

I went quickly into this with HJ and he mentioned that it should be
doable in the linker, and that he has a patch for it in gcc (for local
function, from what I could see):
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html

If @Fangrui is fine with it, I would like to try implementing this
myself in lld (I'm still learning a lot about lld and having an actual
problem to solve is the kind of fuel I need). Should take me a while,
but I think this is not urgent, right? I can also go ahead and replicate
HJ's gcc patch into clang, so we can also handle the local functions
within the compiler (I think this makes a lot of sense).

Once we have these in, I'll revisit FineIBT and extend the features to
handle the FineIBT instrumentation. Hopefully we'll be released from
needing objtool (famous last words?!).

This sounds like a plan, but I'm ofc open to suggestions or different
ideas/plans.

Tks,
Joao

2022-04-22 18:28:59

by Joao Moreira

[permalink] [raw]
Subject: [RFC PATCH 03/11] objtool: Support FineIBT offset fixes

From: Joao Moreira <[email protected]>

Direct branches must not land on FineIBT hash-check operations. Identify
these in the binary and correct the branch offset to skip the check. If
the corrected offset cannot be placed because the branch operand is too
small for the jump, patch the FineIBT hash-check operation with NOPs
instead.
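
For reference, the callee-side sequence such direct branches must skip is
18 bytes (FINEIBT_FIXUP below); a rough sketch, with sizes inferred from
is_fineibt_sequence() and the offsets used in this patch (the hash value
is illustrative):

```asm
func:
	endbr64				# 4 bytes: IBT landing pad
	sub	$0x12345678, %r11	# 7 bytes: check the caller-supplied hash
	je	1f			# 2 bytes: hash matched, skip the handler
	call	__fineibt_handler	# 5 bytes: report the violation
1:					# +18: fixed direct branches land here
```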

This patch is a re-spin of Peter Zijlstra's objtool support from the IBT
series V2.

Signed-off-by: Joao Moreira <[email protected]>
Tinkered-from-patches-by: Peter Zijlstra <[email protected]>
---
scripts/link-vmlinux.sh | 8 +
tools/objtool/arch/x86/decode.c | 175 +++++++++++++----
tools/objtool/arch/x86/include/arch/special.h | 2 +
tools/objtool/builtin-check.c | 3 +-
tools/objtool/check.c | 183 +++++++++++++++++-
tools/objtool/include/objtool/arch.h | 3 +
tools/objtool/include/objtool/builtin.h | 2 +-
7 files changed, 332 insertions(+), 44 deletions(-)

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 20f44504a644..2d657c0232ee 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -126,6 +126,10 @@ objtool_link()
if is_enabled CONFIG_FTRACE_MCOUNT_USE_OBJTOOL; then
objtoolopt="${objtoolopt} --mcount"
fi
+
+ if is_enabled CONFIG_X86_KERNEL_FINEIBT; then
+ objtoolopt="${objtoolopt} --fineibt"
+ fi
fi

if is_enabled CONFIG_VMLINUX_VALIDATION; then
@@ -152,6 +156,9 @@ objtool_link()
if is_enabled CONFIG_SLS; then
objtoolopt="${objtoolopt} --sls"
fi
+ if is_enabled CONFIG_X86_KERNEL_FINEIBT; then
+ objtoolopt="${objtoolopt} --fineibt"
+ fi
info OBJTOOL ${1}
tools/objtool/objtool ${objtoolcmd} ${objtoolopt} ${1}
fi
@@ -175,6 +182,7 @@ vmlinux_link()
shift

if is_enabled CONFIG_LTO_CLANG || is_enabled CONFIG_X86_KERNEL_IBT; then
+ #if is_enabled CONFIG_LTO_CLANG; then
# Use vmlinux.o instead of performing the slow LTO link again.
objs=vmlinux.o
libs=
diff --git a/tools/objtool/arch/x86/decode.c b/tools/objtool/arch/x86/decode.c
index 943cb41cddf7..766a234faa03 100644
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@ -22,6 +22,7 @@
#include <objtool/endianness.h>
#include <objtool/builtin.h>
#include <arch/elf.h>
+#include <arch/special.h>

static int is_x86_64(const struct elf *elf)
{
@@ -241,54 +242,61 @@ int arch_decode_instruction(struct objtool_file *file, const struct section *sec
* 011 SBB 111 CMP
*/

- /* 64bit only */
- if (!rex_w)
- break;
+ /* %rsp target */
+ if (rm_is_reg(CFI_SP)) {

- /* %rsp target only */
- if (!rm_is_reg(CFI_SP))
- break;
+ /* 64 bit only */
+ if (!rex_w)
+ break;

- imm = insn.immediate.value;
- if (op1 & 2) { /* sign extend */
- if (op1 & 1) { /* imm32 */
- imm <<= 32;
- imm = (s64)imm >> 32;
- } else { /* imm8 */
- imm <<= 56;
- imm = (s64)imm >> 56;
+ imm = insn.immediate.value;
+ if (op1 & 2) { /* sign extend */
+ if (op1 & 1) { /* imm32 */
+ imm <<= 32;
+ imm = (s64)imm >> 32;
+ } else { /* imm8 */
+ imm <<= 56;
+ imm = (s64)imm >> 56;
+ }
}
- }

- switch (modrm_reg & 7) {
- case 5:
- imm = -imm;
- /* fallthrough */
- case 0:
- /* add/sub imm, %rsp */
- ADD_OP(op) {
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
- op->src.offset = imm;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
- }
- break;
+ switch (modrm_reg & 7) {
+ case 5:
+ imm = -imm;
+ /* fallthrough */
+ case 0:
+ /* add/sub imm, %rsp */
+ ADD_OP(op) {
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = imm;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
+ break;

- case 4:
- /* and imm, %rsp */
- ADD_OP(op) {
- op->src.type = OP_SRC_AND;
- op->src.reg = CFI_SP;
- op->src.offset = insn.immediate.value;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ case 4:
+ /* and imm, %rsp */
+ ADD_OP(op) {
+ op->src.type = OP_SRC_AND;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.immediate.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
+ break;
+
+ default:
+ /* WARN ? */
+ break;
}
- break;
+ }

- default:
- /* WARN ? */
- break;
+ if (rm_is_reg(CFI_R11)) {
+ if (op1 == 0x81) {
+ *type = INSN_SUB_R11;
+ break;
+ }
}

break;
@@ -729,6 +737,91 @@ const char *arch_nop_insn(int len)
return nops[len-1];
}

+const char *arch_bypass_fineibt(void)
+{
+ // FineIBT skip is:
+ // xor r11, r11 (to destroy any hash contents in r11 and prevent reuse)
+ // jmp (jump to the beginning of function and skip whatever)
+
+ // AFTER_FINEIBT = FineIBT len - endbr len - xor len - jmp len
+#define AFTER_FINEIBT (FINEIBT_FIXUP - 4 - 3 - 2)
+	static const char bytes[7] = { 0x4d, 0x31, 0xdb, 0xeb,
+		AFTER_FINEIBT, BYTES_NOP2 };
+ return bytes;
+}
+
+const char *arch_mod_immediate(struct instruction *insn, unsigned long target)
+{
+ struct section *sec = insn->sec;
+ Elf_Data *data = sec->data;
+ unsigned char op1, op2;
+ static char bytes[16];
+ struct insn x86_insn;
+ int ret, disp;
+
+ disp = (long)(target - (insn->offset + insn->len));
+
+ if (data->d_type != ELF_T_BYTE || data->d_off) {
+ WARN("unexpected data for section: %s", sec->name);
+ return NULL;
+ }
+
+ ret = insn_decode(&x86_insn, data->d_buf + insn->offset, insn->len,
+ INSN_MODE_64);
+ if (ret < 0) {
+ WARN("can't decode instruction at %s:0x%lx", sec->name,
+ insn->offset);
+ return NULL;
+ }
+
+ op1 = x86_insn.opcode.bytes[0];
+ op2 = x86_insn.opcode.bytes[1];
+
+ switch (op1) {
+ case 0x0f: /* escape */
+ switch (op2) {
+ case 0x80 ... 0x8f: /* jcc.d32 */
+ if (insn->len != 6)
+ return NULL;
+ bytes[0] = op1;
+ bytes[1] = op2;
+ *(int *)&bytes[2] = disp;
+ break;
+
+ default:
+ return NULL;
+ }
+ break;
+
+ case 0x70 ... 0x7f: /* jcc.d8 */
+ case 0xeb: /* jmp.d8 */
+ if (insn->len != 2)
+ return NULL;
+
+ if (disp >> 7 != disp >> 31) {
+ WARN("Jump displacement doesn't fit");
+ return NULL;
+ }
+
+ bytes[0] = op1;
+ bytes[1] = disp & 0xff;
+ break;
+
+ case 0xe8: /* call */
+ case 0xe9: /* jmp.d32 */
+ if (insn->len != 5)
+ return NULL;
+ bytes[0] = op1;
+ *(int *)&bytes[1] = disp;
+ break;
+
+ default:
+ return NULL;
+ }
+
+ return bytes;
+}
+
#define BYTE_RET 0xC3

const char *arch_ret_insn(int len)
diff --git a/tools/objtool/arch/x86/include/arch/special.h b/tools/objtool/arch/x86/include/arch/special.h
index f2918f789a0a..f36525ecc7ec 100644
--- a/tools/objtool/arch/x86/include/arch/special.h
+++ b/tools/objtool/arch/x86/include/arch/special.h
@@ -18,4 +18,6 @@
#define ALT_ORIG_LEN_OFFSET 10
#define ALT_NEW_LEN_OFFSET 11

+#define FINEIBT_FIXUP 18
+
#endif /* _X86_ARCH_SPECIAL_H */
diff --git a/tools/objtool/builtin-check.c b/tools/objtool/builtin-check.c
index fc6975ab8b06..2b3813726605 100644
--- a/tools/objtool/builtin-check.c
+++ b/tools/objtool/builtin-check.c
@@ -21,7 +21,7 @@

bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
- ibt;
+ ibt, fineibt;

static const char * const check_usage[] = {
"objtool check [<options>] file.o",
@@ -49,6 +49,7 @@ const struct option check_options[] = {
OPT_BOOLEAN('S', "sls", &sls, "validate straight-line-speculation"),
OPT_BOOLEAN(0, "dry-run", &dryrun, "don't write the modifications"),
OPT_BOOLEAN(0, "ibt", &ibt, "validate ENDBR placement"),
+ OPT_BOOLEAN(0, "fineibt", &fineibt, "fix FineIBT direct calls"),
OPT_END(),
};

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index bd0c2c828940..e5b85ad40454 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -8,6 +8,7 @@
#include <sys/mman.h>

#include <arch/elf.h>
+#include <arch/special.h>
#include <objtool/builtin.h>
#include <objtool/cfi.h>
#include <objtool/arch.h>
@@ -27,7 +28,7 @@ struct alternative {
bool skip_orig;
};

-static unsigned long nr_cfi, nr_cfi_reused, nr_cfi_cache;
+static unsigned long nr_cfi, nr_cfi_reused, nr_cfi_cache, nr_fineibt;

static struct cfi_init_state initial_func_cfi;
static struct cfi_state init_cfi;
@@ -1211,6 +1212,173 @@ static void add_call_dest(struct objtool_file *file, struct instruction *insn,
annotate_call_site(file, insn, sibling);
}

+static bool is_fineibt_sequence(struct objtool_file *file,
+ struct section *dest_sec, unsigned long dest_off) {
+
+ struct instruction *insn;
+
+ insn = find_insn(file, dest_sec, dest_off);
+ if (insn->type != INSN_ENDBR)
+ return false;
+
+ insn = find_insn(file, dest_sec, dest_off+4);
+ arch_decode_instruction(file, dest_sec, dest_off+4,
+ dest_sec->sh.sh_size - (dest_off+4),
+ &insn->len, &insn->type, &insn->immediate,
+ &insn->stack_ops);
+ if (insn->type != INSN_SUB_R11)
+ return false;
+
+ insn = find_insn(file, dest_sec, dest_off+11);
+ arch_decode_instruction(file, dest_sec, dest_off+11,
+ dest_sec->sh.sh_size - (dest_off+11),
+ &insn->len, &insn->type, &insn->immediate,
+ &insn->stack_ops);
+ if (insn->type != INSN_JUMP_CONDITIONAL)
+ return false;
+
+ // XXX: when call __fineibt_handler is replaced by ud2, check it here.
+ return true;
+}
+
+static bool nopout_jmp_target(struct objtool_file *file,
+ struct instruction *jmp) {
+ struct instruction *tgt = jmp->jump_dest;
+ const char *op = arch_bypass_fineibt();
+
+ if (is_fineibt_sequence(file, tgt->sec, tgt->offset)) {
+ tgt = next_insn_same_func(file, tgt);
+ elf_write_insn(file->elf, tgt->sec, tgt->offset, 7, op);
+ return true;
+ }
+ return false;
+}
+
+static void fix_fineibt_call(struct objtool_file *file,
+ struct instruction *insn) {
+
+ unsigned long dest_off;
+ struct instruction *target = NULL;
+ struct reloc *reloc = insn_reloc(file, insn);
+ const char *op = NULL;
+
+ if (!reloc) {
+ dest_off = arch_jump_destination(insn);
+ target = find_insn(file, insn->sec,
+ dest_off);
+
+ } else if (reloc->sym->retpoline_thunk) {
+ return;
+
+ } else {
+ dest_off = reloc->sym->offset +
+ arch_dest_reloc_offset(reloc->addend);
+ target = find_insn(file, reloc->sym->sec, dest_off);
+ }
+
+ if (target && target->type == INSN_ENDBR
+ && is_fineibt_sequence(file, target->sec, dest_off)) {
+ if (reloc) {
+ reloc->addend += FINEIBT_FIXUP;
+ elf_write_reloc(file->elf, reloc);
+ } else {
+ op = arch_mod_immediate(insn, dest_off + FINEIBT_FIXUP);
+ if (op) {
+ elf_write_insn(file->elf, insn->sec,
+ insn->offset, insn->len, op);
+ } else {
+ WARN_FUNC("Can't fix direct call to FineIBT",
+ insn->sec, insn->offset);
+ return;
+ }
+ }
+ }
+}
+
+static void fix_fineibt_jump(struct objtool_file *file,
+ struct instruction *insn) {
+
+ unsigned long dest_off;
+ struct section *dest_sec;
+ struct reloc *reloc = insn_reloc(file, insn);
+ const char *op = NULL;
+
+ if (!reloc) {
+ dest_sec = insn->sec;
+ dest_off = arch_jump_destination(insn);
+
+ } else if (reloc->sym->type == STT_SECTION) {
+ dest_sec = reloc->sym->sec;
+ dest_off = arch_dest_reloc_offset(reloc->addend);
+
+ } else if (reloc->sym->retpoline_thunk) {
+ return;
+
+ } else if (insn->func || reloc->sym->sec->idx) {
+ dest_sec = reloc->sym->sec;
+ dest_off = reloc->sym->offset +
+ arch_dest_reloc_offset(reloc->addend);
+ } else {
+ return;
+ }
+
+ insn->jump_dest = find_insn(file, dest_sec, dest_off);
+ if (!insn->jump_dest) {
+ if (!vmlinux && insn->func) {
+ dest_sec = insn->func->sec;
+ dest_off = insn->func->offset;
+ } else if (!strcmp(insn->sec->name, ".altinstr_replacement")) {
+ /* XXX: corner case unclear if fineibt-relevant */
+ return;
+ }
+ }
+
+ if (insn->jump_dest->type == INSN_ENDBR &&
+ is_fineibt_sequence(file, insn->jump_dest->sec,
+ dest_off)) {
+ if (reloc) {
+ reloc->addend += FINEIBT_FIXUP;
+ elf_write_reloc(file->elf, reloc);
+ return;
+ }
+ op = arch_mod_immediate(insn, dest_off + FINEIBT_FIXUP);
+ if (op) {
+ elf_write_insn(file->elf, insn->sec, insn->offset,
+ insn->len, op);
+ return;
+ }
+		if (nopout_jmp_target(file, insn)) {
+			/* hash check nop'ed out; the short jump is now valid */
+			return;
+		} else {
+			WARN_FUNC("Can't fix direct jump to FineIBT",
+				insn->sec, insn->offset);
+			return;
+		}
+ }
+}
+
+static int fix_fineibt_branches(struct objtool_file *file) {
+ struct instruction *insn;
+ struct section *sec;
+
+ for_each_sec(file, sec) {
+ if (!sec->text) {
+ continue;
+ }
+
+ sec_for_each_insn(file, sec, insn) {
+ if (insn->type == INSN_CALL) {
+ fix_fineibt_call(file, insn);
+ } else if (is_static_jump(insn)) {
+ fix_fineibt_jump(file, insn);
+ }
+ }
+ }
+ return 0;
+}
+
static void add_retpoline_call(struct objtool_file *file, struct instruction *insn)
{
/*
@@ -3859,6 +4027,11 @@ int check(struct objtool_file *file)
return 1;
}

+ if (fineibt && !ibt) {
+ fprintf(stderr, "--fineibt requires: --ibt\n");
+ return 1;
+ }
+
arch_initial_func_cfi_state(&initial_func_cfi);
init_cfi_state(&init_cfi);
init_cfi_state(&func_cfi);
@@ -3945,11 +4118,19 @@ int check(struct objtool_file *file)
warnings += ret;
}

+ if (fineibt) {
+ ret = fix_fineibt_branches(file);
+ if (ret < 0)
+ goto out;
+ warnings += ret;
+ }
+
if (stats) {
printf("nr_insns_visited: %ld\n", nr_insns_visited);
printf("nr_cfi: %ld\n", nr_cfi);
printf("nr_cfi_reused: %ld\n", nr_cfi_reused);
printf("nr_cfi_cache: %ld\n", nr_cfi_cache);
+ if (fineibt) printf("nr_fineibt_entries: %ld\n", nr_fineibt);
}

out:
diff --git a/tools/objtool/include/objtool/arch.h b/tools/objtool/include/objtool/arch.h
index 9b19cc304195..9c12770ccd13 100644
--- a/tools/objtool/include/objtool/arch.h
+++ b/tools/objtool/include/objtool/arch.h
@@ -28,6 +28,7 @@ enum insn_type {
INSN_CLD,
INSN_TRAP,
INSN_ENDBR,
+ INSN_SUB_R11,
INSN_OTHER,
};

@@ -85,6 +86,8 @@ unsigned long arch_dest_reloc_offset(int addend);

const char *arch_nop_insn(int len);
const char *arch_ret_insn(int len);
+const char *arch_mod_immediate(struct instruction *insn, unsigned long target);
+const char *arch_bypass_fineibt(void);

int arch_decode_hint_reg(u8 sp_reg, int *base);

diff --git a/tools/objtool/include/objtool/builtin.h b/tools/objtool/include/objtool/builtin.h
index c39dbfaef6dc..801eaeae1627 100644
--- a/tools/objtool/include/objtool/builtin.h
+++ b/tools/objtool/include/objtool/builtin.h
@@ -10,7 +10,7 @@
extern const struct option check_options[];
extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats,
lto, vmlinux, mcount, noinstr, backup, sls, dryrun,
- ibt;
+ ibt, fineibt;

extern int cmd_parse_options(int argc, const char **argv, const char * const usage[]);

--
2.35.1

2022-04-22 18:36:42

by Fangrui Song

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On 2022-04-21, H.J. Lu wrote:
>On Thu, Apr 21, 2022 at 8:23 AM Joao Moreira <[email protected]> wrote:
>>
>> On 2022-04-21 00:49, Peter Zijlstra wrote:
>> > On Wed, Apr 20, 2022 at 03:40:41PM -0700, Joao Moreira wrote:
>> >> > >
>> >> > > If FineIBT needs it, I could reconsider. But I think there's a strong
>> >> > > case to be made that the linker should be doing that instead.
>> >> >
>> >> > That sounds reasonable to me (and reminds me of linker relaxation).
>> >> > Joao, can you please work with Fangrui (LLD) and HJ (GNU binutils) to
>> >> > determine how feasible this would be? I assume code outside the kernel
>> >> > might enjoy such an optimization, too. When that's the case, then it
>> >> > probably makes more sense to "upstream" such "optimizations" from the
>> >> > kernel-specific objtool into the toolchains.
>> >>
>> >> Alright, these are the greenlights I was hoping for.
>> >>
>> >> I went quickly into this with HJ and he mentioned that it should be
>> >> doable
>> >> in the linker, and that he has a patch for it in gcc (for local
>> >> function,
>> >> from what I could see):
>> >> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html
>> >>
>> >> If @Fangrui is fine with it, I would like to try implementing this
>> >> myself in
>> >> lld (I'm still learning a lot about lld and having an actual problem
>> >> to
>> >> solve is the kind of fuel I need). Should take me a while, but I think
>> >> this
>> >> is not urgent, right? I can also go ahead and replicate HJ's gcc patch
>> >> into
>> >> clang, so we can also handle the local functions within the compiler
>> >> (I
>> >> think this makes a lot of sense).
>> >>
>> >> Once we have these in, I'll revisit FineIBT and extend the features to
>> >> handle the FineIBT instrumentation. Hopefully we'll be released from
>> >> needing
>> >> objtool (famous last words?!).
>> >>
>> >> This sounds like a plan, but I'm ofc open to suggestions or different
>> >> ideas/plans.

Thanks for looping me in! (I mean: it's good this is discussed beforehand,
instead of GCC/GNU ld taking some steps and pushing the LLVM toolchain to
follow suit..)
Though I think I may not have the full context here...

I can see that you have a request to skip the endbr instruction
(and potentially more instructions for FineIBT).

The GCC patch https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html
achieves it. A linker implementation will handle more cases.
This is somewhat similar to PowerPC64's global entry vs local entry.
The linker can redirect a branch instruction to the local entry in some
conditions. My concern is that inspecting the section content goes too far
and breaks the spirit of ELF relocation resolving. The proper way is to use
a symbol attribute (e.g. st_other).

st_other bits are scarce, so any use needs to be prudent.

>> > So trivially the plan sounds like: compiler fixes STB_LOCAL because it
>> > has the scope, and the linker fixes everything else. However, that
>> > seems
>> > to assume that !STB_LOCAL will have ENDBR.
>> >
>> > This latter isn't true; for one there's __attribute__((nocf_check))
>> > that
>> > can be used to suppress ENDBR generation on a function.
>> >
>> > Alternatively the linker will need to 'read' the function to determine
>> > if it has ENDBR, or we need to augment the ELF format such that we can
>> > tell from that.
>> >
>> > So what exactly is the plan?
>>
>> I ran into too many broken dreams by trying to infer the presence of
>> ENDBRs just by the symbol locality/linkage... not only because of the
>> attribute, but also because of ancient assembly.
>>
>> So, my first thought was to use something similar to the
>> __patchable_function_entries section
>> (https://man7.org/linux/man-pages/man1/gcc.1.html), where we would have
>> a section to mark all the placed ENDBR. But then it occurred to me that
>> if we follow that road we'll miss the ENDBR placed in assembly unless we
>> mark it manually, so I started thinking that reading the target
>> instructions from the ELF object could be a more simplified approach,
>> although a little more treacherous.
>>
>> I didn't decide yet what to try first -- any thoughts?
>>
>> @Fangrui's and @HJ's thoughts about this could be gold.
>
>You can't assume ENDBR existence just by symbol visibility.
>Compiler knows if there is an ENDBR at function entry since
>it is generated by compiler. Otherwise, you need to check
>the first 4 bytes at function entry,
>
>--
>H.J.

2022-04-22 19:44:55

by Joao Moreira

[permalink] [raw]
Subject: [RFC PATCH 02/11] kbuild: Support FineIBT build

From: Joao Moreira <[email protected]>

Add FineIBT compilation flags to Makefiles, preserving translation
units which should not get it.

Signed-off-by: Joao Moreira <[email protected]>
---
arch/x86/Kconfig | 10 ++++++++++
arch/x86/Makefile | 3 +++
arch/x86/entry/vdso/Makefile | 5 +++++
arch/x86/kernel/Makefile | 1 +
arch/x86/purgatory/Makefile | 4 ++++
5 files changed, 23 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b0142e01002e..37e49e9187a0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1872,6 +1872,16 @@ config X86_KERNEL_IBT
does significantly reduce the number of ENDBR instructions in the
kernel image.

+config CC_HAS_FINEIBT
+ def_bool $(cc-option, -fcf-protection=branch -mfine-ibt) && $(as-instr,endbr64)
+
+config X86_KERNEL_FINEIBT
+ prompt "Fine-grain Indirect Branch Tracking"
+ bool
+ depends on X86_KERNEL_IBT && CC_HAS_FINEIBT
+ help
+ Build the kernel with Fine-grained IBT.
+
config X86_INTEL_MEMORY_PROTECTION_KEYS
prompt "Memory Protection Keys"
def_bool y
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 63d50f65b828..768e318eb78f 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -73,6 +73,9 @@ ifeq ($(CONFIG_X86_KERNEL_IBT),y)
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104816
#
KBUILD_CFLAGS += $(call cc-option,-fcf-protection=branch -fno-jump-tables)
+ifeq ($(CONFIG_X86_KERNEL_FINEIBT),y)
+KBUILD_CFLAGS += $(call cc-option, -mfine-ibt)
+endif
else
KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
endif
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 693f8b9031fb..3dce5571460e 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -91,7 +91,11 @@ ifneq ($(RETPOLINE_VDSO_CFLAGS),)
endif
endif

+ifdef CONFIG_X86_KERNEL_FINEIBT
+$(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS) -mfine-ibt,$(KBUILD_CFLAGS)) $(CFL)
+else
$(vobjs): KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO) $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
+endif

#
# vDSO code runs in userspace and -pg doesn't help with profiling anyway.
@@ -151,6 +155,7 @@ KBUILD_CFLAGS_32 := $(filter-out -mfentry,$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out $(CC_FLAGS_LTO),$(KBUILD_CFLAGS_32))
+KBUILD_CFLAGS_32 := $(filter-out -mfine-ibt,$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=0 -fpic
KBUILD_CFLAGS_32 += -fno-stack-protector
KBUILD_CFLAGS_32 += $(call cc-option, -foptimize-sibling-calls)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index c41ef42adbe8..cb947569e9d8 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -148,6 +148,7 @@ obj-$(CONFIG_UNWINDER_FRAME_POINTER) += unwind_frame.o
obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o

obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o
+obj-$(CONFIG_X86_KERNEL_FINEIBT) += fineibt.o

###
# 64 bit specific files
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index ae53d54d7959..e16b25a598ba 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -55,6 +55,10 @@ ifdef CONFIG_RETPOLINE
PURGATORY_CFLAGS_REMOVE += $(RETPOLINE_CFLAGS)
endif

+ifdef CONFIG_X86_KERNEL_FINEIBT
+PURGATORY_CFLAGS_REMOVE += -mfine-ibt
+endif
+
CFLAGS_REMOVE_purgatory.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_purgatory.o += $(PURGATORY_CFLAGS)

--
2.35.1

2022-04-22 19:53:18

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On Wed, Apr 20, 2022 at 8:17 AM Josh Poimboeuf <[email protected]> wrote:
>
> On Wed, Apr 20, 2022 at 09:40:44AM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 19, 2022 at 05:42:30PM -0700, [email protected] wrote:
> > > @PeterZ @JoshP
> > >
> > > I'm a bit unaware of the details on why the objtool approach to bypass ENDBRs
> > > was removed from the IBT series. Is this approach now sensible considering that
> > > it is a requirement for a new/enhanced feature? If not, how extending the Linker
> > > to emit already fixed offsets sounds like?
> >
> > Josh hates objtool modifying actual code. He much prefers that objtool
> > only emit out-of-band data.
> >
> > Now, I did sneak in that jump_label nop'ing, and necessity (broken
> > compilers) had us do the KCOV nop'ing in noinstr, but if you look at the
> > recent objtool series here:
> >
> > https://lkml.kernel.org/r/[email protected]
> >
> > you'll see his thoughts on that :-)
> >
> > Now, I obviously don't mind, it's easy enough to figure out what objtool
> > actually does with something like:
> >
> > $ OBJTOOL_ARGS="--backup" make O=ibt-build/ -j$lots vmlinux
> > $ objdiff.sh ibt-build/vmlinux.o
> >
> > Where objdiff.sh is the below crummy script.
> >
> > Now, one compromise that I did get out of Josh was that he objected less
> > to rewriting relocations than to rewriting the immediates. From my
> > testing the relocations got us the vast majority of direct call sites,
> > very few are immediates.
> >
> > Josh, any way you might reconsider all that? :-)
>
> If I remember correctly, the goal of --ibt-fix-direct was to avoid
> hitting unnecessary ENDBRs, which basically decode to NOPs, so the
> ghastly hack wasn't worth it.
>
> If FineIBT needs it, I could reconsider. But I think there's a strong
> case to be made that the linker should be doing that instead.

That sounds reasonable to me (and reminds me of linker relaxation).
Joao, can you please work with Fangrui (LLD) and HJ (GNU binutils) to
determine how feasible this would be? I assume code outside the kernel
might enjoy such an optimization, too. When that's the case, then it
probably makes more sense to "upstream" such "optimizations" from the
kernel-specific objtool into the toolchains.
--
Thanks,
~Nick Desaulniers

2022-04-22 20:09:08

by Joao Moreira

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

> I think it'd be good to get kCFI landed in Clang first (since it is
> effectively architecture agnostic), and then get FineIBT landed. But
> that doesn't mean we can't be working on the kernel side of things at
> the same time.

FWIW, I'm effectively taking some time away from work for the next 3
months. I'll be around to answer this and that, help review KCFI and
maybe send small fixes around, but I'm not planning to land FineIBT in
clang anytime before that (especially now that I have a direction to look
into the linker approach as per the other thread e-mails). This should
give KCFI the time it needs to squeeze in.

>
> And just thinking generally, for other architecture-specific stuff,
> I do wonder what an arm64 PAC-based CFI might look like. I prefer
> things
> be hard-coded as kCFI is doing, but it'd be nice to be able to directly
> measure performance and size overheads comparing the various methods.

There are other important bullets to this list, I think, like power
consumption, robustness and collateral gains (like IBT's side-channel
hardening). But yeah, this is probably a good list to keep in mind for
us to discuss during plumbers :)

2022-04-22 21:04:20

by Joao Moreira

[permalink] [raw]
Subject: [RFC PATCH 06/11] x86/bpf: Support FineIBT

From: Joao Moreira <[email protected]>

BPF jitted code calls helper functions that are in the core and contain a
FineIBT hash check sequence in their prologue. Make the BPF JIT capable of
identifying FineIBT sequences and of adding the proper offset to bypass
them when emitting calls and jumps.

Signed-off-by: Joao Moreira <[email protected]>
Tinkered-from-patches-by: Peter Zijlstra <[email protected]>
---
arch/x86/net/bpf_jit_comp.c | 31 +++++++++++++++++++++++++++++++
1 file changed, 31 insertions(+)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 16b6efacf7c6..e0c82174a075 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -330,13 +330,44 @@ static int emit_patch(u8 **pprog, void *func, void *ip, u8 opcode)
return 0;
}

+static inline bool skip_fineibt_sequence(void *func)
+{
+ const void *addr = (void *) func;
+ union text_poke_insn text;
+ u32 insn;
+
+ if ((get_kernel_nofault(insn, addr)) ||
+ (!is_endbr(insn)))
+ return false;
+
+ if ((get_kernel_nofault(text, addr+4)) ||
+ (text.opcode != SUB_INSN_OPCODE))
+ return false;
+
+ if ((get_kernel_nofault(text, addr+11)) ||
+ (text.opcode != JE_INSN_OPCODE))
+ return false;
+
+ if ((get_kernel_nofault(text, addr+13)) ||
+ (text.opcode != CALL_INSN_OPCODE))
+ return false;
+
+ return true;
+}
+
static int emit_call(u8 **pprog, void *func, void *ip)
{
+#ifdef CONFIG_X86_KERNEL_FINEIBT
+ if (skip_fineibt_sequence(func)) func = func + FINEIBT_FIXUP;
+#endif
return emit_patch(pprog, func, ip, 0xE8);
}

static int emit_jump(u8 **pprog, void *func, void *ip)
{
+#ifdef CONFIG_X86_KERNEL_FINEIBT
+ if (skip_fineibt_sequence(func)) func = func + FINEIBT_FIXUP;
+#endif
return emit_patch(pprog, func, ip, 0xE9);
}

--
2.35.1

2022-04-22 21:23:42

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On Wed, Apr 20, 2022 at 03:40:41PM -0700, Joao Moreira wrote:
> > >
> > > If FineIBT needs it, I could reconsider. But I think there's a strong
> > > case to be made that the linker should be doing that instead.
> >
> > That sounds reasonable to me (and reminds me of linker relaxation).
> > Joao, can you please work with Fangrui (LLD) and HJ (GNU binutils) to
> > determine how feasible this would be? I assume code outside the kernel
> > might enjoy such an optimization, too. When that's the case, then it
> > probably makes more sense to "upstream" such "optimizations" from the
> > kernel-specific objtool into the toolchains.
>
> Alright, these are the greenlights I was hoping for.
>
> I went quickly into this with HJ and he mentioned that it should be doable
> in the linker, and that he has a patch for it in gcc (for local function,
> from what I could see):
> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html
>
> If @Fangrui is fine with it, I would like to try implementing this myself in
> lld (I'm still learning a lot about lld and having an actual problem to
> solve is the kind of fuel I need). Should take me a while, but I think this
> is not urgent, right? I can also go ahead and replicate HJ's gcc patch into
> clang, so we can also handle the local functions within the compiler (I
> think this makes a lot of sense).
>
> Once we have these in, I'll revisit FineIBT and extend the features to
> handle the FineIBT instrumentation. Hopefully we'll be released from needing
> objtool (famous last words?!).
>
> This sounds like a plan, but I'm ofc open to suggestions or different
> ideas/plans.

So trivially the plan sounds like: compiler fixes STB_LOCAL because it
has the scope, and the linker fixes everything else. However, that seems
to assume that !STB_LOCAL will have ENDBR.

This latter isn't true; for one there's __attribute__((nocf_check)) that
can be used to suppress ENDBR generation on a function.

Alternatively the linker will need to 'read' the function to determine
if it has ENDBR, or we need to augment the ELF format such that we can
tell from that.

So what exactly is the plan?

2022-04-22 21:34:08

by Joao Moreira

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On 2022-04-21 00:49, Peter Zijlstra wrote:
> On Wed, Apr 20, 2022 at 03:40:41PM -0700, Joao Moreira wrote:
>> > >
>> > > If FineIBT needs it, I could reconsider. But I think there's a strong
>> > > case to be made that the linker should be doing that instead.
>> >
>> > That sounds reasonable to me (and reminds me of linker relaxation).
>> > Joao, can you please work with Fangrui (LLD) and HJ (GNU binutils) to
>> > determine how feasible this would be? I assume code outside the kernel
>> > might enjoy such an optimization, too. When that's the case, then it
>> > probably makes more sense to "upstream" such "optimizations" from the
>> > kernel-specific objtool into the toolchains.
>>
>> Alright, these are the greenlights I was hoping for.
>>
>> I went quickly into this with HJ and he mentioned that it should be
>> doable
>> in the linker, and that he has a patch for it in gcc (for local
>> function,
>> from what I could see):
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590832.html
>>
>> If @Fangrui is fine with it, I would like to try implementing this
>> myself in
>> lld (I'm still learning a lot about lld and having an actual problem
>> to
>> solve is the kind of fuel I need). Should take me a while, but I think
>> this
>> is not urgent, right? I can also go ahead and replicate HJ's gcc patch
>> into
>> clang, so we can also handle the local functions within the compiler
>> (I
>> think this makes a lot of sense).
>>
>> Once we have these in, I'll revisit FineIBT and extend the features to
>> handle the FineIBT instrumentation. Hopefully we'll be released from
>> needing
>> objtool (famous last words?!).
>>
>> This sounds like a plan, but I'm ofc open to suggestions or different
>> ideas/plans.
>
> So trivially the plan sounds like: compiler fixes STB_LOCAL because it
> has the scope, and the linker fixes everything else. However, that
> seems
> to assume that !STB_LOCAL will have ENDBR.
>
> This latter isn't true; for one there's __attribute__((nocf_check))
> that
> can be used to suppress ENDBR generation on a function.
>
> Alternatively the linker will need to 'read' the function to determine
> if it has ENDBR, or we need to augment the ELF format such that we can
> tell from that.
>
> So what exactly is the plan?

I ran into too many broken dreams by trying to infer the presence of
ENDBRs just by the symbol locality/linkage... not only because of the
attribute, but also because of ancient assembly.

So, my first thought was to use something similar to the
__patchable_function_entries section
(https://man7.org/linux/man-pages/man1/gcc.1.html), where we would have
a section to mark all the placed ENDBR. But then it occurred to me that
if we follow that road we'll miss the ENDBR placed in assembly unless we
mark it manually, so I started thinking that reading the target
instructions from the ELF object could be a simpler approach,
although a little more treacherous.

I didn't decide yet what to try first -- any thoughts?

@Fangrui's and @HJ's thoughts about this could be gold.

2022-04-22 21:59:56

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC PATCH 00/11] Kernel FineIBT Support

On Tue, Apr 19, 2022 at 05:42:30PM -0700, [email protected] wrote:
> @PeterZ @JoshP
>
> I'm a bit unaware of the details on why the objtool approach to bypass ENDBRs
> was removed from the IBT series. Is this approach now sensible considering that
> it is a requirement for a new/enhanced feature? If not, how extending the Linker
> to emit already fixed offsets sounds like?

Josh hates objtool modifying actual code. He much prefers that objtool
only emit out-of-band data.

Now, I did sneak in that jump_label nop'ing, and necessity (broken
compilers) had us do the KCOV nop'ing in noinstr, but if you look at the
recent objtool series here:

https://lkml.kernel.org/r/[email protected]

you'll see his thoughts on that :-)

Now, I obviously don't mind, it's easy enough to figure out what objtool
actually does with something like:

$ OBJTOOL_ARGS="--backup" make O=ibt-build/ -j$lots vmlinux
$ objdiff.sh ibt-build/vmlinux.o

Where objdiff.sh is the below crummy script.

Now, one compromise that I did get out of Josh was that he objected less
to rewriting relocations than to rewriting the immediates. From my
testing the relocations got us the vast majority of direct call sites,
very few are immediates.

Josh, any way you might reconsider all that? :-)

---
#!/bin/bash

name=$1
pre=${name}.orig
post=${name}

function to_text {
obj=$1
( objdump -wdr $obj;
readelf -W --relocs --symbols $obj |
awk '/^Relocation section/ { $6=0 } { print $0 }'
) > ${obj}.tmp
}

to_text $pre
to_text $post

diff -u ${pre}.tmp ${post}.tmp

rm ${pre}.tmp ${post}.tmp

2022-04-22 22:15:18

by Joao Moreira

[permalink] [raw]
Subject: [RFC PATCH 11/11] driver/int3400_thermal: Fix prototype matching

From: Joao Moreira <[email protected]>

The function dev_attr_show directly invokes functions from drivers
expecting a specific prototype. The driver for int3400_thermal
implements the given show function using a different prototype than what
is expected. This violates the prototype-based fine-grained CFI policy.

Make the function prototype compliant and cast the respective assignment
so it can be properly used together with fine-grained CFI.

(FWIW, there should be a less ugly patch for this, but I don't know
enough about the touched source code.)

Signed-off-by: Joao Moreira <[email protected]>
---
.../thermal/intel/int340x_thermal/int3400_thermal.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
index 4954800b9850..4bd95a2016b7 100644
--- a/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
+++ b/drivers/thermal/intel/int340x_thermal/int3400_thermal.c
@@ -311,12 +311,13 @@ static int int3400_thermal_get_uuids(struct int3400_thermal_priv *priv)
return result;
}

-static ssize_t odvp_show(struct kobject *kobj, struct kobj_attribute *attr,
+static ssize_t odvp_show(struct device *kobj, struct device_attribute *attr,
char *buf)
{
+ struct kobj_attribute *kattr = (struct kobj_attribute *) attr;
struct odvp_attr *odvp_attr;

- odvp_attr = container_of(attr, struct odvp_attr, attr);
+ odvp_attr = container_of(kattr, struct odvp_attr, attr);

return sprintf(buf, "%d\n", odvp_attr->priv->odvp[odvp_attr->odvp]);
}
@@ -388,7 +389,10 @@ static int evaluate_odvp(struct int3400_thermal_priv *priv)
goto out_err;
}
odvp->attr.attr.mode = 0444;
- odvp->attr.show = odvp_show;
+ odvp->attr.show = (ssize_t (*)
+ (struct kobject *,
+ struct kobj_attribute *,
+ char *)) odvp_show;
odvp->attr.store = NULL;
ret = sysfs_create_file(&priv->pdev->dev.kobj,
&odvp->attr.attr);
--
2.35.1

2022-04-22 22:37:39

by Joao Moreira

[permalink] [raw]
Subject: Re: [RFC PATCH 10/11] linux/interrupt: Fix prototype matching property

>> Fix this CFI policy violation by removing the function pointer union
>> in
>> the tasklet struct.
>
> The good news is that tasklet is on the way out the door[1], so this
> may
> quickly become a non-issue, but also to that end, this fix is hardly a
> problem for a deprecated API...

You are right, sorry for the noise. I looked a bit further and the
problem I saw was actually caused by a compiler bug fusing similar
instructions/basic blocks. It was fixed when I later stumbled on the
problem again and added the following lines (668 and 669 in
llvm/lib/CodeGen/MachineInstr.cpp) to the compiler, but without properly
realizing what was actually behind the previous issue. Hopefully this is
at least a good heads-up about possible pitfalls to other people (@Sami)
implementing CFI in the compiler.

https://github.com/lvwr/llvm-project/commit/0a22ca42877fd156ce95145b11f29c642092dbb7#diff-92843a1f037a9a1e56f92242c4e1746a1166a6b7044ad47a0b4fd2f4b1c6a359R668-R669

2022-04-29 02:47:58

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Tue, Apr 19, 2022 at 05:42:31PM -0700, [email protected] wrote:
> +void __noendbr __fineibt_handler(void){
> + unsigned i;
> + unsigned long flags;
> + bool skip;
> + void * ret;
> + void * caller;
> +
> + DO_ALL_PUSHS;

So this function isn't C ABI compliant, right? e.g. the compiler just
calls the handler without regard for preserving registers?

If this function is going to be implemented in C, it should probably
have an asm thunk wrapper which can properly save/restore the registers
before calling into the C version.

Even better, if the compiler did an invalid op (UD2?), which I think you
mentioned elsewhere, instead of calling the handler directly, and there
were a way for the trap code to properly detect it as a FineIBT
violation, we could get rid of the pushes/pops, plus the uaccess objtool
warning from patch 7, plus I'm guessing a bunch of noinstr validation
warnings.

> +
> + spin_lock_irqsave(&fineibt_lock, flags);
> + skip = false;
> +
> + asm("\t movq 0x90(%%rsp),%0" : "=r"(ret));
> + asm("\t movq 0x98(%%rsp),%0" : "=r"(caller));

This is making some questionable assumptions about the stack layout.

I assume this function is still in the prototype stage ;-)

> + if(!skip) {
> + printk("FineIBT violation: %px:%px:%u\n", ret, caller,
> + vlts_next);
> + }
> + DO_ALL_POPS;
> +}

Right now this handler just does a printk if it hasn't already for this
caller/callee combo, and then resumes control. Which is fine for
debugging, but it really needs to behave similarly to an IBT violation,
by panicking unless "ibt=warn" on the cmdline.

Not sure what would happen for "ibt=off"? Maybe apply_ibt_endbr() could
NOP out all the FineIBT stuff.

--
Josh

2022-05-02 23:39:51

by Joao Moreira

[permalink] [raw]
Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On 2022-04-28 18:37, Josh Poimboeuf wrote:
> On Tue, Apr 19, 2022 at 05:42:31PM -0700, [email protected]
> wrote:
>> +void __noendbr __fineibt_handler(void){
>> + unsigned i;
>> + unsigned long flags;
>> + bool skip;
>> + void * ret;
>> + void * caller;
>> +
>> + DO_ALL_PUSHS;
>
> So this function isn't C ABI compliant, right? e.g. the compiler just
> calls the handler without regard for preserving registers?
>
> If this function is going to be implemented in C, it should probably
> have an asm thunk wrapper which can properly save/restore the registers
> before calling into the C version.
>
> Even better, if the compiler did an invalid op (UD2?), which I think
> you
> mentioned elsewhere, instead of calling the handler directly, and there
> were a way for the trap code to properly detect it as a FineIBT
> violation, we could get rid of the pushes/pops, plus the uaccess
> objtool
> warning from patch 7, plus I'm guessing a bunch of noinstr validation
> warnings.

Cool, I'll try to come up with something!

>
>> +
>> + spin_lock_irqsave(&fineibt_lock, flags);
>> + skip = false;
>> +
>> + asm("\t movq 0x90(%%rsp),%0" : "=r"(ret));
>> + asm("\t movq 0x98(%%rsp),%0" : "=r"(caller));
>
> This is making some questionable assumptions about the stack layout.
>
> I assume this function is still in the prototype stage ;-)

Yeah, this is just a messy instrumentation to get reports about
mismatching prototypes (as the ones reported further down the series).

The issue with having the call is that it bloats the binary, so the ud2
is 3-bytes-per-function better. Yet, we may consider a FINEIBT_DEBUG
config, which can then enable a handler. This would be useful together
with a fuzzer or a stress tool to cover possible control-flow paths
within the kernel and map mismatching prototypes more properly I guess.

>
>> + if(!skip) {
>> + printk("FineIBT violation: %px:%px:%u\n", ret, caller,
>> + vlts_next);
>> + }
>> + DO_ALL_POPS;
>> +}
>
> Right now this handler just does a printk if it hasn't already for this
> caller/callee combo, and then resumes control. Which is fine for
> debugging, but it really needs to behave similarly to an IBT violation,
> by panicking unless "ibt=warn" on the cmdline.
>
> Not sure what would happen for "ibt=off"? Maybe apply_ibt_endbr()
> could
> NOP out all the FineIBT stuff.

Either that, or...

I'm thinking about a way to have FineIBT interchangeable with KCFI.
Currently KCFI adds a 4 byte hash + 2 byte nops before function entry,
to allow for proper prototype checking. After that, there should be an
ENDBR of 4 bytes. This gives us 10 bytes in total. Then, my still rough
idea would be to patch these 10 bytes with:

endbr
call fineibt_handler_<$HASH>
nop

and then, on the caller side, patch the "cmp <$HASH>, -0x6(%r11); je;
ud2; call" sequence with a "sub 0x6, r11; mov $HASH, %r10; call %r11,
add 0x6 %r11". This would then allow the kernel to verify if the CPU is
IBT capable on boot time and only then setting the proper scheme.

The downsides of having something like this would be that this sub
r11/add r11 sequence is kinda meh. We can avoid that by having two
padding nops after the original ENDBR, which will be skipped when the
function is reached directly by the linker optimization I'm working on,
and that we can convert into a JMP -offset that makes control flow reach
the padding area before the prologue and from where we can call the
fineibt_handler function. The resulting instrumentation would be
something like:

1:
call fineibt_handler_<$HASH>
jmp 2f
<foo>
endbr
jmp 1b
2:

Also, it would prevent a paranoid user from having both schemes
simultaneously (there are reasons why people could want that).

Any thoughts?

2022-05-04 08:53:03

by Joao Moreira

[permalink] [raw]
Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

>
> It should be possible to have a non-fatal #UD2 handler.
>
> See for example how WARN() is implemented with __WARN_FLAGS in
> arch/x86/include/asm/bug.h .
>
> So hopefully we can just get rid of the need for the "call handler"
> thing altogether.
>
Nice, I'll look into it. Tks.

>> > Not sure what would happen for "ibt=off"? Maybe apply_ibt_endbr() could
>> > NOP out all the FineIBT stuff.
>>
>> Either that, or...
>>
>> I'm thinking about a way to have FineIBT interchangeable with KCFI.
>> Currently KCFI adds a 4 byte hash + 2 byte nops before function entry,
>> to
>> allow for proper prototype checking. After that, there should be an
>> ENDBR of
>> 4 bytes. This gives us 10 bytes in total. Then, my yet to be properly
>> thought idea would be patch these 10 bytes with:
>>
>> endbr
>> call fineibt_handler_<$HASH>
>> nop
>>
>> and then, on the caller side, patch the "cmp <$HASH>, -0x6(%r11); je;
>> ud2;
>> call" sequence with a "sub 0x6, r11; mov $HASH, %r10; call %r11, add
>> 0x6
>> %r11". This would then allow the kernel to verify if the CPU is IBT
>> capable
>> on boot time and only then setting the proper scheme.
>>
>> The downsides of having something like this would be that this sub
>> r11/add
>> r11 sequence is kinda meh. We can avoid that by having two padding
>> nops
>> after the original ENDBR, which will be skipped when the function is
>> reached
>> directly by the linker optimization I'm working on, and that we can
>> convert
>> into a JMP -offset that makes control flow reach the padding area
>> before the
>> prologue and from where we can call the fineibt_handler function. The
>> resulting instrumentation would be something like:
>>
>> 1:
>> call fineibt_handler_<$HASH>
>> jmp 2f
>> <foo>
>> endbr
>> jmp 1b
>> 2:
>>
>> Also, it would prevent a paranoid user to have both schemes
>> simultaneously
>> (there are reasons why people could want that).
>>
>> Any thoughts?
>
> I'm not really qualified to comment on this too directly since I
> haven't
> looked very much at the variations on FineIBT/CFI/KCFI, and what the
> protections and drawbacks are for each approach, and when it might even
> make sense to combine them for a "paranoid user".
>
> Since we have multiple similar and possibly competing technologies
> being
> discussed, one thing I do want to warn against is that we as kernel
> developers tend to err on the side of giving people too many choices
> and
> combinations which *never* get used.
>
> All those unused options can confuse the user and significantly add to
> complexity and maintenance overhead for us. Especially for invasive
> features like these.
>
> (Just to be clear, I'm not saying that's happening here, but it's
> something we need to be careful about.)
>
> Here, documentation is going to be crucial, for both reviewers and
> users. Something that describes when/why I should use X or Y or X+Y.
>
> If we truly want to add more options/combos for different use cases
> then
> we'll also need clear and concise documentation about which
> options/combos would be used under what circumstances.

Yeah, I totally understand/support this concern and I feel the same way.
While, in this case, I can't see clear reasons for X+Y, there are reasons
why someone could prefer X or Y -- so I think using alternatives to switch
between the two instrumentations is a valid consideration. Taking the
chance to give credit where it's due: using alternatives to replace
KCFI with FineIBT was also Peter's idea, not mine. It looked hard at
first sight because of the differences between caller-side and
callee-side checks, but once Peter mentioned it I started trying to work
out an instrumentation that could be patched from one scheme to the
other. I haven't discussed this with anyone yet, but at this point I
think it might be doable, although not in the most performant shape.
Anyway, I'll post something here once I have a more solid idea.

And yes, I agree that documentation will be key and I totally see your
point/understand how confusing I was in my previous e-mail. I'll keep
that in mind for the next revision. Thanks for pointing it out :)

2022-05-04 15:19:59

by Peter Zijlstra

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Tue, May 03, 2022 at 03:02:44PM -0700, Josh Poimboeuf wrote:

> I'm not really qualified to comment on this too directly since I haven't
> looked very much at the variations on FineIBT/CFI/KCFI, and what the
> protections and drawbacks are for each approach, and when it might even
> make sense to combine them for a "paranoid user".
>
> Since we have multiple similar and possibly competing technologies being
> discussed, one thing I do want to warn against is that we as kernel
> developers tend to err on the side of giving people too many choices and
> combinations which *never* get used.

So I don't think there's going to be a user choice here. If there's
hardware support, FineIBT makes more sense. That also means that kCFI no
longer needs to worry about IBT.

If we do something like:


        kCFI                                    FineIBT

__cfi_\sym:                             __cfi_\sym:
        endbr                   # 4             endbr                   # 4
        sub $hash, %r10         # 7             sub $hash, %r10         # 7
        je \sym                 # 2             je \sym                 # 2
        ud2                     # 2             ud2                     # 2
\sym:                                   \sym:


caller:                                 caller:
        cmpl $hash, -8(%r11)    # 8             movl $hash, %r10d       # 6
        je 1f                   # 2             sub $15, %r11           # 4
        ud2                     # 2             call *%r11              # 3
1:      call __x86_indirect_thunk_r11 # 5       .nop 4                  # 4 (could even fix up r11 again)


Then, all that's required is a slight tweak to apply_retpolines() to
rewrite a little more text.

Note that this also does away with having to fix up the linker, since
all direct calls will already point at \sym. It's just the IBT indirect
calls that need to frob the pointer in order to hit the ENDBR.

On top of that, we no longer have to special-case the objtool
instruction decoder, since the preludes are proper instructions now.

2022-05-05 14:25:39

by Peter Zijlstra

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Wed, May 04, 2022 at 10:04:02AM -0700, Peter Collingbourne wrote:
> On Wed, May 04, 2022 at 12:20:19PM +0200, Peter Zijlstra wrote:
> > On Tue, May 03, 2022 at 03:02:44PM -0700, Josh Poimboeuf wrote:
> >
> > > I'm not really qualified to comment on this too directly since I haven't
> > > looked very much at the variations on FineIBT/CFI/KCFI, and what the
> > > protections and drawbacks are for each approach, and when it might even
> > > make sense to combine them for a "paranoid user".
> > >
> > > Since we have multiple similar and possibly competing technologies being
> > > discussed, one thing I do want to warn against is that we as kernel
> > > developers tend to err on the side of giving people too many choices and
> > > combinations which *never* get used.
> >
> > So I don't think there's going to be a user choice here. If there's
> > hardware support, FineIBT makes more sense. That also means that kCFI no
> > longer needs to worry about IBT.
> >
> > If we do something like:
> >
> >
> > kCFI FineIBT
> >
> > __cfi_\sym: __cfi_\sym:
> > endbr # 4 endbr # 4
> > sub $hash, %r10 # 7 sub $hash, %r10 # 7
> > je \sym # 2 je \sym # 2
> > ud2 # 2 ud2 # 2
> > \sym: \sym:
> >
> >
> > caller: caller:
> > cmpl $hash, -8(%r11) # 8 movl $hash, %r10d # 6
> > je 1f # 2 sub $15, %r11 # 4
> > ud2 # 2 call *%r11 # 3
> > 1: call __x86_indirect_thunk_r11 # 5 .nop 4 # 4 (could even fix up r11 again)
> >
> >
> > Then, all that's required is a slight tweak to apply_retpolines() to
> > rewrite a little more text.
> >
> > Note that this also does away with having to fix up the linker, since
> > all direct calls will already point at \sym. It's just the IBT indirect
> > calls that need to frob the pointer in order to hit the ENDBR.
> >
> > On top of that, we no longer have to special-case the objtool
> > instruction decoder, since the preludes are proper instructions now.
>
> For kCFI this brings back the gadget problem that I mentioned here:
> https://lore.kernel.org/all/[email protected]/
>
> because the hash at the call site is 8 bytes before the call
> instruction.

Damn, I forgot about that. Too subtle :-/

So Joao had another crazy idea, lemme diagram that to see if it works.

(sorry I inverted the order by accident)


        FineIBT                                 kCFI

__fineibt_\hash:
        xor \hash, %r10         # 7
        jz 1f                   # 2
        ud2                     # 2
1:      ret                     # 1
        int3                    # 1


__cfi_\sym:                             __cfi_\sym:
                                                int3; int3              # 2
        endbr                   # 4             mov \hash, %eax         # 5
        call __fineibt_\hash    # 5             int3; int3              # 2
\sym:                                   \sym:
        ...                                     ...


caller:                                 caller:
        movl \hash, %r10d       # 6             cmpl \hash, -6(%r11)    # 8
        sub $9, %r11            # 4             je 1f                   # 2
        call *%r11              # 3             ud2                     # 2
        .nop 4                  # 4 (or fixup r11)      call __x86_indirect_thunk_r11 # 5


This way we also need to patch the __cfi_\sym contents, but we get a
little extra room to place the constant for kCFI in a suitable location.

It seems to preserve the properties of the last one in that direct calls
will already be correct and we don't need linker fixups, and objtool can
simply parse the preamble as regular instructions without needing
further help.

2022-05-05 16:05:09

by Peter Zijlstra

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Wed, May 04, 2022 at 05:28:57PM -0700, Sami Tolvanen wrote:
> On Wed, May 4, 2022 at 11:17 AM Peter Zijlstra <[email protected]> wrote:
> > __cfi_\sym: __cfi_\sym:
> > int3; int3 # 2
> > endbr # 4 mov \hash, %eax # 5
> > call __fineibt_\hash # 5 int3; int3 # 2
> > \sym: \sym:
>
> OK, that looks reasonable to me.
>
> > It seems to preserve the properties of the last one in that direct calls
> > will already be correct and we don't need linker fixups, and objtool can
> > simply parse the preamble as regular instructions without needing
> > further help.
>
> Wouldn't objtool still print out unreachable instruction warnings here?

Depends a bit on what kind of symbol they end up being, if they're
STT_FUNC we'll probably get the complaint that it falls through into the
next symbol, while if they're STT_NOTYPE then yes, we'll get the
unreachable thing.

So either way we need to special case the __cfi_\sym things anyway.

But that should be relatively straightforward. I think I would lean
towards making them STT_FUNC (they really are for FineIBT anyway) and
then suppressing the fallthrough warning for all functions that start
with "__cfi_". This way we get an ORC entry for them and the unwinder
will be happy.

2022-05-05 17:03:34

by Josh Poimboeuf

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Mon, May 02, 2022 at 10:17:42AM -0700, Joao Moreira wrote:
> > > + asm("\t movq 0x90(%%rsp),%0" : "=r"(ret));
> > > + asm("\t movq 0x98(%%rsp),%0" : "=r"(caller));
> >
> > This is making some questionable assumptions about the stack layout.
> >
> > I assume this function is still in the prototype stage ;-)
>
> Yeah, this is just a messy instrumentation to get reports about mismatching
> prototypes (as the ones reported further down the series).
>
> The issue with having the call is that it bloats the binary, so the ud2 is
> 3-bytes-per-function better. Yet, we may consider a FINEIBT_DEBUG config,
> which can then enable a handler. This would be useful together with a fuzzer
> or a stress tool to cover possible control-flow paths within the kernel and
> map mismatching prototypes more properly I guess.

It should be possible to have a non-fatal #UD handler.

See for example how WARN() is implemented with __WARN_FLAGS in
arch/x86/include/asm/bug.h .

So hopefully we can just get rid of the need for the "call handler"
thing altogether.

> > Not sure what would happen for "ibt=off"? Maybe apply_ibt_endbr() could
> > NOP out all the FineIBT stuff.
>
> Either that, or...
>
> I'm thinking about a way to have FineIBT interchangeable with KCFI.
> Currently KCFI adds a 4 byte hash + 2 byte nops before function entry, to
> allow for proper prototype checking. After that, there should be an ENDBR of
> 4 bytes. This gives us 10 bytes in total. Then, my (yet to be properly
> thought out) idea would be to patch these 10 bytes with:
>
> endbr
> call fineibt_handler_<$HASH>
> nop
>
> and then, on the caller side, patch the "cmp <$HASH>, -0x6(%r11); je; ud2;
> call" sequence with a "sub 0x6, r11; mov $HASH, %r10; call %r11, add 0x6
> %r11". This would then allow the kernel to verify at boot time whether
> the CPU is IBT capable and only then set the proper scheme.
>
> The downsides of having something like this would be that this sub r11/add
> r11 sequence is kinda meh. We can avoid that by having two padding nops
> after the original ENDBR, which will be skipped when the function is reached
> directly by the linker optimization I'm working on, and that we can convert
> into a JMP -offset that makes control flow reach the padding area before the
> prologue and from where we can call the fineibt_handler function. The
> resulting instrumentation would be something like:
>
> 1:
> call fineibt_handler_<$HASH>
> jmp 2f
> <foo>
> endbr
> jmp 1b
> 2:
>
> Also, it would prevent a paranoid user from having both schemes
> simultaneously (there are reasons why people could want that).
>
> Any thoughts?

I'm not really qualified to comment on this too directly since I haven't
looked very much at the variations on FineIBT/CFI/KCFI, and what the
protections and drawbacks are for each approach, and when it might even
make sense to combine them for a "paranoid user".

Since we have multiple similar and possibly competing technologies being
discussed, one thing I do want to warn against is that we as kernel
developers tend to err on the side of giving people too many choices and
combinations which *never* get used.

All those unused options can confuse the user and significantly add to
complexity and maintenance overhead for us. Especially for invasive
features like these.

(Just to be clear, I'm not saying that's happening here, but it's
something we need to be careful about.)

Here, documentation is going to be crucial, for both reviewers and
users. Something that describes when/why I should use X or Y or X+Y.

If we truly want to add more options/combos for different use cases then
we'll also need clear and concise documentation about which
options/combos would be used under what circumstances.

--
Josh


2022-05-09 03:18:26

by Kees Cook

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Wed, May 04, 2022 at 08:16:57PM +0200, Peter Zijlstra wrote:
> FineIBT kCFI
>
> __fineibt_\hash:
> xor \hash, %r10 # 7
> jz 1f # 2
> ud2 # 2
> 1: ret # 1
> int3 # 1
>
>
> __cfi_\sym: __cfi_\sym:
> int3; int3 # 2
> endbr # 4 mov \hash, %eax # 5
> call __fineibt_\hash # 5 int3; int3 # 2
> \sym: \sym:
> ... ...
>
>
> caller: caller:
> movl \hash, %r10d # 6 cmpl \hash, -6(%r11) # 8
> sub $9, %r11 # 4 je 1f # 2
> call *%r11 # 3 ud2 # 2
> .nop 4 # 4 (or fixup r11) call __x86_indirect_thunk_r11 # 5

This looks good!

And just to double-check my understanding here... \sym is expected to
start with endbr with IBT + kCFI?


Random extra thoughts... feel free to ignore. :) Given that both CFI
schemes depend on an attacker not being able to construct an executable
memory region that either starts with endbr (for FineIBT) or starts with
hash & 2 bytes (for kCFI), we should likely take another look at where
the kernel uses PAGE_KERNEL_EXEC.

It seems non-specialized use is entirely done via module_alloc(). Obviously
modules need to stay as-is. So we're left with other module_alloc()
callers: BPF JIT, ftrace, and kprobes.

Perhaps enabling CFI should tie bpf_jit_harden (which performs constant
blinding) to the value of bpf_jit_enable? (i.e. either use BPF VM which
reads from non-exec memory, or use BPF JIT with constant blinding.)

I *think* all the kprobes and ftrace stuff ends up using constructed
direct calls, though, yes? So if we did bounds checking, we could
"exclude" them as well as the BPF JIT. Though I'm not sure how
controllable the content written to the kprobes and ftrace regions is?

For exclusion, we could separate actual modules from the other
module_alloc() users by maybe allocating in opposite directions from the
randomized offset and check indirect calls against the kernel text bounds
and the new modules-only bounds. Sounds expensive, though. Maybe PKS,
but I can't imagine 2 MSR writes per indirect call would be fast. Hmm...

--
Kees Cook

2022-05-09 05:29:23

by Peter Collingbourne

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Wed, May 04, 2022 at 12:20:19PM +0200, Peter Zijlstra wrote:
> On Tue, May 03, 2022 at 03:02:44PM -0700, Josh Poimboeuf wrote:
>
> > I'm not really qualified to comment on this too directly since I haven't
> > looked very much at the variations on FineIBT/CFI/KCFI, and what the
> > protections and drawbacks are for each approach, and when it might even
> > make sense to combine them for a "paranoid user".
> >
> > Since we have multiple similar and possibly competing technologies being
> > discussed, one thing I do want to warn against is that we as kernel
> > developers tend to err on the side of giving people too many choices and
> > combinations which *never* get used.
>
> So I don't think there's going to be a user choice here. If there's
> hardware support, FineIBT makes more sense. That also means that kCFI no
> longer needs to worry about IBT.
>
> If we do something like:
>
>
> kCFI FineIBT
>
> __cfi_\sym: __cfi_\sym:
> endbr # 4 endbr # 4
> sub $hash, %r10 # 7 sub $hash, %r10 # 7
> je \sym # 2 je \sym # 2
> ud2 # 2 ud2 # 2
> \sym: \sym:
>
>
> caller: caller:
> cmpl $hash, -8(%r11) # 8 movl $hash, %r10d # 6
> je 1f # 2 sub $15, %r11 # 4
> ud2 # 2 call *%r11 # 3
> 1: call __x86_indirect_thunk_r11 # 5 .nop 4 # 4 (could even fix up r11 again)
>
>
> Then, all that's required is a slight tweak to apply_retpolines() to
> rewrite a little more text.
>
> Note that this also does away with having to fix up the linker, since
> all direct calls will already point at \sym. It's just the IBT indirect
> calls that need to frob the pointer in order to hit the ENDBR.
>
> On top of that, we no longer have to special-case the objtool
> instruction decoder, since the preludes are proper instructions now.

For kCFI this brings back the gadget problem that I mentioned here:
https://lore.kernel.org/all/[email protected]/

because the hash at the call site is 8 bytes before the call
instruction.

Peter

2022-05-09 08:03:31

by Sami Tolvanen

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Wed, May 4, 2022 at 11:17 AM Peter Zijlstra <[email protected]> wrote:
> __cfi_\sym: __cfi_\sym:
> int3; int3 # 2
> endbr # 4 mov \hash, %eax # 5
> call __fineibt_\hash # 5 int3; int3 # 2
> \sym: \sym:

OK, that looks reasonable to me.

> It seems to preserve the properties of the last one in that direct calls
> will already be correct and we don't need linker fixups, and objtool can
> simply parse the preamble as regular instructions without needing
> further help.

Wouldn't objtool still print out unreachable instruction warnings here?

Sami

2022-05-09 12:03:00

by Peter Zijlstra

Subject: Re: [RFC PATCH 01/11] x86: kernel FineIBT

On Sun, May 08, 2022 at 01:29:13AM -0700, Kees Cook wrote:
> On Wed, May 04, 2022 at 08:16:57PM +0200, Peter Zijlstra wrote:
> > FineIBT kCFI
> >
> > __fineibt_\hash:
> > xor \hash, %r10 # 7
> > jz 1f # 2
> > ud2 # 2
> > 1: ret # 1
> > int3 # 1
> >
> >
> > __cfi_\sym: __cfi_\sym:
> > int3; int3 # 2
> > endbr # 4 mov \hash, %eax # 5
> > call __fineibt_\hash # 5 int3; int3 # 2
> > \sym: \sym:
> > ... ...
> >
> >
> > caller: caller:
> > movl \hash, %r10d # 6 cmpl \hash, -6(%r11) # 8
> > sub $9, %r11 # 4 je 1f # 2
> > call *%r11 # 3 ud2 # 2
> > .nop 4 # 4 (or fixup r11) call __x86_indirect_thunk_r11 # 5
>
> This looks good!
>
> And just to double-check my understanding here... \sym is expected to
> start with endbr with IBT + kCFI?

Ah, the thinking was that 'if IBT then FineIBT', so the combination of
kCFI and IBT is of no concern. And since FineIBT will have the ENDBR in
the __cfi_\sym thing, \sym will not need it.

But thinking about this now I suppose __nocfi call symbols will still
need the ENDBR on. Objtool IBT validation would need to either find
ENDBR or a matching __cfi_\sym I suppose.

So I was talking to Joao on IRC the other day, and I realized that if
kCFI generates code as per the above, then we can do FineIBT purely
in-kernel. That is: have objtool generate a section of __cfi_\sym
locations. Then use the .retpoline_sites and .cfi_sites to rewrite kCFI
into the FineIBT form in multiple passes:

- read all the __cfi_\sym sites and collect all unique hash values

- allocate (module) memory and write __fineibt_\hash functions for each
unique hash value found

- rewrite callers; nop out kCFI

- rewrite all __cfi_\sym

- rewrite all callers

- enable IBT

And the same on module load I suppose.

But I've only thought about this, not actually implemented it, so who
knows what surprises are lurking there :-)

> Random extra thoughts... feel free to ignore. :) Given that both CFI
> schemes depend on an attacker not being able to construct an executable
> memory region that either starts with endbr (for FineIBT) or starts with
> hash & 2 bytes (for kCFI), we should likely take another look at where
> the kernel uses PAGE_KERNEL_EXEC.
>
> It seems non-specialized use is entirely done via module_alloc(). Obviously
> modules need to stay as-is. So we're left with other module_alloc()
> callers: BPF JIT, ftrace, and kprobes.
>
> Perhaps enabling CFI should tie bpf_jit_harden (which performs constant
> blinding) to the value of bpf_jit_enable? (i.e. either use BPF VM which
> reads from non-exec memory, or use BPF JIT with constant blinding.)
>
> I *think* all the kprobes and ftrace stuff ends up using constructed
> direct calls, though, yes? So if we did bounds checking, we could
> "exclude" them as well as the BPF JIT. Though I'm not sure how
> controllable the content written to the kprobes and ftrace regions
> is?

Both ftrace and kprobes only write fairly simple trampolines based off
of a template. Neither has indirect calls.

> For exclusion, we could separate actual modules from the other
> module_alloc() users by maybe allocating in opposite directions from the
> randomized offset and check indirect calls against the kernel text bounds
> and the new modules-only bounds. Sounds expensive, though. Maybe PKS,
> but I can't imagine 2 MSR writes per indirect call would be fast. Hmm...

I'm not sure what problem you're trying to solve...