2022-02-24 16:21:24

by Peter Zijlstra

[permalink] [raw]
Subject: [PATCH v2 00/39] x86: Kernel IBT

Hi,

This is an even more complete Kernel IBT implementation.

Since last time (in no specific order):

- Reworked Xen and paravirt bits lots (andyhpp)
- Reworked entry annotation (jpoimboe)
- Renamed CONFIG symbol to CONFIG_X86_KERNEL_IBT (redgecomb)
- Pinned CR4_CET (kees)
- Added __noendbr to CET control functions (kees)
- kexec (redgecomb)
- made function-graph, kprobes and bpf not explode (rostedt)
- cleanups and split ups (jpoimboe, mbenes)
- reworked whole module objtool (nathanchance)
- attempted and failed at making Clang go

Specifically to clang; I made clang-13 explode by rediscovering:
https://reviews.llvm.org/D111108, then I tried clang-14 but it looks like
ld.lld is still generating .plt entries out of thin air.

Also, I know the very first patch is somewhat controversial amonst the clang
people, but I really think the current state of affairs is abysmal and this
lets me at least use clang.

Patches are also available here:

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/wip.ibt

This series is on top of tip/master along with the linkage patches from Mark:

https://lore.kernel.org/all/[email protected]/

Enjoy!


2022-02-24 22:15:46

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Thu, Feb 24, 2022 at 03:51:38PM +0100, Peter Zijlstra wrote:
> Hi,
>
> This is an even more complete Kernel IBT implementation.
>
> Since last time (in no specific order):
>
> - Reworked Xen and paravirt bits lots (andyhpp)
> - Reworked entry annotation (jpoimboe)
> - Renamed CONFIG symbol to CONFIG_X86_KERNEL_IBT (redgecomb)
> - Pinned CR4_CET (kees)
> - Added __noendbr to CET control functions (kees)
> - kexec (redgecomb)
> - made function-graph, kprobes and bpf not explode (rostedt)
> - cleanups and split ups (jpoimboe, mbenes)
> - reworked whole module objtool (nathanchance)
> - attempted and failed at making Clang go
>
> Specifically to clang; I made clang-13 explode by rediscovering:
> https://reviews.llvm.org/D111108, then I tried clang-14 but it looks like
> ld.lld is still generating .plt entries out of thin air.
>
> Also, I know the very first patch is somewhat controversial amonst the clang
> people, but I really think the current state of affairs is abysmal and this
> lets me at least use clang.
>
> Patches are also available here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/wip.ibt

Bricked my SPR:

[ 21.602888] jump_label: Fatal kernel bug, unexpected op at sched_clock_stable+0x4/0x20 [0000000074a0db20] (eb 06 b8 01 00 != eb 0a 00 00 00)) size:2 type:0
[ 21.618489] ------------[ cut here ]------------
[ 21.623706] kernel BUG at arch/x86/kernel/jump_label.c:73!
[ 21.629903] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 21.630897] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G I 5.17.0-rc5+ #3
[ 21.630897] Hardware name: Intel Corporation ArcherCity/ArcherCity, BIOS EGSDCRB1.86B.0064.D15.2109031959 09/03/2021
[ 21.630897] RIP: 0010:__jump_label_patch.cold.0+0x24/0x26
[ 21.630897] Code: 9f e9 9d b1 5f ff 48 c7 c3 a8 44 65 a0 41 55 45 89 f1 49 89 d8 4c 89 e1 4c 89 e2 4c 89 e6 48 c7 c7 78 00 77 9f e8 e8 a8 00 00 <0f> 0b 48 89 fb 48 c7 c6 f0 03 77 9f 48 8d bf d0 00 00 00 e8 41 16
[ 21.630897] RSP: 0000:ff7af25cc01cfd68 EFLAGS: 00010246
[ 21.630897] RAX: 000000000000008f RBX: ffffffffa06544a8 RCX: 0000000000000001
[ 21.630897] RDX: 0000000000000000 RSI: 00000000fffeffff RDI: 00000000ffffffff
[ 21.630897] RBP: ff7af25cc01cfd98 R08: 0000000000000000 R09: c0000000fffeffff
[ 21.630897] R10: 0000000000000001 R11: ff7af25cc01cfb88 R12: ffffffff9e520a74
[ 21.630897] R13: 0000000000000000 R14: 0000000000000002 R15: ffffffff9f7293b7
[ 21.630897] FS: 0000000000000000(0000) GS:ff422b3e2cc40000(0000) knlGS:0000000000000000
[ 21.630897] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 21.630897] CR2: 0000000000000000 CR3: 0000001994810001 CR4: 0000000000f71ee0
[ 21.630897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 21.630897] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 21.630897] PKRU: 55555554
[ 21.630897] Call Trace:
[ 21.630897] <TASK>
[ 21.630897] arch_jump_label_transform_queue+0x37/0x80
[ 21.630897] __jump_label_update+0x74/0x130
[ 21.630897] ? sched_init+0x556/0x556
[ 21.630897] ? rdinit_setup+0x30/0x30
[ 21.630897] static_key_enable_cpuslocked+0x5b/0x90
[ 21.630897] static_key_enable+0x1a/0x20
[ 21.630897] sched_clock_init_late+0x7a/0x95
[ 21.630897] do_one_initcall+0x45/0x200
[ 21.630897] kernel_init_freeable+0x211/0x27a
[ 21.630897] ? rest_init+0xd0/0xd0
[ 21.630897] kernel_init+0x1a/0x130
[ 21.630897] ret_from_fork+0x1f/0x30
[ 21.630897] </TASK>
[ 21.630897] Modules linked in:
[ 21.838238] ---[ end trace 0000000000000000 ]---


ffffffff81120a70 <sched_clock_stable>:
ffffffff81120a70: f3 0f 1e fa endbr64
ffffffff81120a74: eb 06 jmp ffffffff81120a7c <sched_clock_stable+0xc>
ffffffff81120a76: b8 01 00 00 00 mov $0x1,%eax
ffffffff81120a7b: c3 retq
ffffffff81120a7c: f3 0f 1e fa endbr64
ffffffff81120a80: 31 c0 xor %eax,%eax
ffffffff81120a82: c3 retq
ffffffff81120a83: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
ffffffff81120a8a: 00 00 00 00
ffffffff81120a8e: 66 90 xchg %ax,%ax

2022-02-25 17:13:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Thu, Feb 24, 2022 at 12:26:02PM -0800, Josh Poimboeuf wrote:

> Bricked my SPR:
>
> [ 21.602888] jump_label: Fatal kernel bug, unexpected op at sched_clock_stable+0x4/0x20 [0000000074a0db20] (eb 06 b8 01 00 != eb 0a 00 00 00)) size:2 type:0

> ffffffff81120a70 <sched_clock_stable>:
> ffffffff81120a70: f3 0f 1e fa endbr64
> ffffffff81120a74: eb 06 jmp ffffffff81120a7c <sched_clock_stable+0xc>
> ffffffff81120a76: b8 01 00 00 00 mov $0x1,%eax
> ffffffff81120a7b: c3 retq
> ffffffff81120a7c: f3 0f 1e fa endbr64
> ffffffff81120a80: 31 c0 xor %eax,%eax
> ffffffff81120a82: c3 retq
> ffffffff81120a83: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
> ffffffff81120a8a: 00 00 00 00
> ffffffff81120a8e: 66 90 xchg %ax,%ax

This is due to you having a very old (and arguably buggy) compiler :-( I
can reproduce with gcc-8.4 and gcc-9.4, my gcc-10.3 compiler no longer
generates daft code like that, nor do any later.

That said, I can fix objtool to also re-write jumps to in-the-middle
ENDBR like this, but then I do get a bunch of:

OBJTOOL vmlinux.o
vmlinux.o: warning: objtool: displacement doesn't fit
vmlinux.o: warning: objtool: ep_insert()+0xbc5: Direct IMM jump to ENDBR; cannot fix
vmlinux.o: warning: objtool: displacement doesn't fit
vmlinux.o: warning: objtool: configfs_depend_prep()+0x76: Direct IMM jump to ENDBR; cannot fix
vmlinux.o: warning: objtool: displacement doesn't fit
vmlinux.o: warning: objtool: request_key_and_link()+0x17b: Direct IMM jump to ENDBR; cannot fix
vmlinux.o: warning: objtool: displacement doesn't fit
vmlinux.o: warning: objtool: blk_mq_poll()+0x2e0: Direct IMM jump to ENDBR; cannot fix

The alternative is only skipping endbr at +0 I suppose, lemme go try
that with the brand spanking new skip_endbr() function.

Yep,.. that seems to cure things. It noaw boats when build with old
crappy compilers too.


--- a/arch/x86/include/asm/ibt.h
+++ b/arch/x86/include/asm/ibt.h
@@ -47,6 +47,8 @@ static inline bool is_endbr(unsigned int
return val == gen_endbr();
}

+extern void *skip_endbr(void *);
+
extern __noendbr u64 ibt_save(void);
extern __noendbr void ibt_restore(u64 save);

@@ -71,6 +73,7 @@ extern __noendbr void ibt_restore(u64 sa
#define __noendbr

static inline bool is_endbr(unsigned int val) { return false; }
+static inline void *skip_endbr(void *addr) { return addr; }

static inline u64 ibt_save(void) { return 0; }
static inline void ibt_restore(u64 save) { }
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -112,10 +112,7 @@ void __text_gen_insn(void *buf, u8 opcod
OPTIMIZER_HIDE_VAR(addr);
OPTIMIZER_HIDE_VAR(dest);

-#ifdef CONFIG_X86_KERNEL_IBT
- if (is_endbr(*(u32 *)dest))
- dest += 4;
-#endif
+ dest = skip_endbr((void *)dest);

insn->opcode = opcode;

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -620,6 +620,19 @@ __noendbr void ibt_restore(u64 save)
}
}

+
+void *skip_endbr(void *addr)
+{
+ unsigned long size, offset;
+
+ if (is_endbr(*(unsigned int *)addr) &&
+ kallsyms_lookup_size_offset((unsigned long)addr, &size, &offset) &&
+ !offset)
+ addr += 4;
+
+ return addr;
+}
+
#endif

static __always_inline void setup_cet(struct cpuinfo_x86 *c)
@@ -636,7 +649,10 @@ static __always_inline void setup_cet(st
if (!ibt_selftest()) {
pr_err("IBT selftest: Failed!\n");
setup_clear_cpu_cap(X86_FEATURE_IBT);
+ return;
}
+
+ pr_info("CET detected: Indirect Branch Tracking enabled\n");
}

__noendbr void cet_disable(void)
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -350,18 +350,12 @@ static int __bpf_arch_text_poke(void *ip
u8 *prog;
int ret;

-#ifdef CONFIG_X86_KERNEL_IBT
- if (is_endbr(*(u32 *)ip))
- ip += 4;
-#endif
+ ip = skip_endbr(ip);

memcpy(old_insn, nop_insn, X86_PATCH_SIZE);
if (old_addr) {
prog = old_insn;
-#ifdef CONFIG_X86_KERNEL_IBT
- if (is_endbr(*(u32 *)old_addr))
- old_addr += 4;
-#endif
+ old_addr = skip_endbr(old_addr);
ret = t == BPF_MOD_CALL ?
emit_call(&prog, old_addr, ip) :
emit_jump(&prog, old_addr, ip);
@@ -372,10 +366,7 @@ static int __bpf_arch_text_poke(void *ip
memcpy(new_insn, nop_insn, X86_PATCH_SIZE);
if (new_addr) {
prog = new_insn;
-#ifdef CONFIG_X86_KERNEL_IBT
- if (is_endbr(*(u32 *)new_addr))
- new_addr += 4;
-#endif
+ new_addr = skip_endbr(new_addr);
ret = t == BPF_MOD_CALL ?
emit_call(&prog, new_addr, ip) :
emit_jump(&prog, new_addr, ip);

2022-02-26 01:45:59

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Fri, Feb 25, 2022 at 04:43:20PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 25, 2022 at 04:28:32PM +0100, Peter Zijlstra wrote:
> > +void *skip_endbr(void *addr)
> > +{
> > + unsigned long size, offset;
> > +
> > + if (is_endbr(*(unsigned int *)addr) &&
> > + kallsyms_lookup_size_offset((unsigned long)addr, &size, &offset) &&
> > + !offset)
> > + addr += 4;
> > +
> > + return addr;
> > +}
>
> Damn, I just realized this makes KERNEL_IBT hard depend on KALLSYMS :-(

Why should the jump label patching code even care whether there's an
ENDBR at the jump target? It should never jump to the beginning of a
function anyway, right? And objtool presumably doesn't patch out ENDBRs
in the middle of a function.

--
Josh

2022-02-26 01:55:37

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Fri, 25 Feb 2022 09:26:44 -0800
Josh Poimboeuf <[email protected]> wrote:

> > Damn, I just realized this makes KERNEL_IBT hard depend on KALLSYMS :-(
>
> Why should the jump label patching code even care whether there's an
> ENDBR at the jump target? It should never jump to the beginning of a
> function anyway, right? And objtool presumably doesn't patch out ENDBRs
> in the middle of a function.

Perhaps Peter confused jump labels with static calls?

-- Steve

2022-02-26 02:02:15

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Fri, Feb 25, 2022 at 12:32:38PM -0500, Steven Rostedt wrote:
> On Fri, 25 Feb 2022 09:26:44 -0800
> Josh Poimboeuf <[email protected]> wrote:
>
> > > Damn, I just realized this makes KERNEL_IBT hard depend on KALLSYMS :-(
> >
> > Why should the jump label patching code even care whether there's an
> > ENDBR at the jump target? It should never jump to the beginning of a
> > function anyway, right? And objtool presumably doesn't patch out ENDBRs
> > in the middle of a function.
>
> Perhaps Peter confused jump labels with static calls?

The code is shared between the two.

2022-02-26 02:17:13

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Fri, Feb 25, 2022 at 08:53:03PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 25, 2022 at 12:32:38PM -0500, Steven Rostedt wrote:
> > On Fri, 25 Feb 2022 09:26:44 -0800
> > Josh Poimboeuf <[email protected]> wrote:
> >
> > > > Damn, I just realized this makes KERNEL_IBT hard depend on KALLSYMS :-(
> > >
> > > Why should the jump label patching code even care whether there's an
> > > ENDBR at the jump target? It should never jump to the beginning of a
> > > function anyway, right? And objtool presumably doesn't patch out ENDBRs
> > > in the middle of a function.
> >
> > Perhaps Peter confused jump labels with static calls?
>
> The code is shared between the two.

Then just have a text_gen_insn_dont_skip_endbr() or so... ?

--
Josh

2022-02-26 02:29:03

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Fri, Feb 25, 2022 at 04:28:32PM +0100, Peter Zijlstra wrote:
> +void *skip_endbr(void *addr)
> +{
> + unsigned long size, offset;
> +
> + if (is_endbr(*(unsigned int *)addr) &&
> + kallsyms_lookup_size_offset((unsigned long)addr, &size, &offset) &&
> + !offset)
> + addr += 4;
> +
> + return addr;
> +}

Damn, I just realized this makes KERNEL_IBT hard depend on KALLSYMS :-(

2022-03-02 01:13:51

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Fri, Feb 25, 2022 at 04:28:32PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 24, 2022 at 12:26:02PM -0800, Josh Poimboeuf wrote:
>
> > Bricked my SPR:
> >
> > [ 21.602888] jump_label: Fatal kernel bug, unexpected op at sched_clock_stable+0x4/0x20 [0000000074a0db20] (eb 06 b8 01 00 != eb 0a 00 00 00)) size:2 type:0
>
> > ffffffff81120a70 <sched_clock_stable>:
> > ffffffff81120a70: f3 0f 1e fa endbr64
> > ffffffff81120a74: eb 06 jmp ffffffff81120a7c <sched_clock_stable+0xc>
> > ffffffff81120a76: b8 01 00 00 00 mov $0x1,%eax
> > ffffffff81120a7b: c3 retq
> > ffffffff81120a7c: f3 0f 1e fa endbr64
> > ffffffff81120a80: 31 c0 xor %eax,%eax
> > ffffffff81120a82: c3 retq
> > ffffffff81120a83: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
> > ffffffff81120a8a: 00 00 00 00
> > ffffffff81120a8e: 66 90 xchg %ax,%ax
>
> This is due to you having a very old (and arguably buggy) compiler :-( I
> can reproduce with gcc-8.4 and gcc-9.4, my gcc-10.3 compiler no longer
> generates daft code like that, nor do any later.
>
> That said, I can fix objtool to also re-write jumps to in-the-middle
> ENDBR like this, but then I do get a bunch of:
>
> OBJTOOL vmlinux.o
> vmlinux.o: warning: objtool: displacement doesn't fit
> vmlinux.o: warning: objtool: ep_insert()+0xbc5: Direct IMM jump to ENDBR; cannot fix
> vmlinux.o: warning: objtool: displacement doesn't fit
> vmlinux.o: warning: objtool: configfs_depend_prep()+0x76: Direct IMM jump to ENDBR; cannot fix
> vmlinux.o: warning: objtool: displacement doesn't fit
> vmlinux.o: warning: objtool: request_key_and_link()+0x17b: Direct IMM jump to ENDBR; cannot fix
> vmlinux.o: warning: objtool: displacement doesn't fit
> vmlinux.o: warning: objtool: blk_mq_poll()+0x2e0: Direct IMM jump to ENDBR; cannot fix
>
> The alternative is only skipping endbr at +0 I suppose, lemme go try
> that with the brand spanking new skip_endbr() function.
>
> Yep,.. that seems to cure things. It noaw boats when build with old
> crappy compilers too.
>
>
> --- a/arch/x86/include/asm/ibt.h
> +++ b/arch/x86/include/asm/ibt.h
> @@ -47,6 +47,8 @@ static inline bool is_endbr(unsigned int
> return val == gen_endbr();
> }
>
> +extern void *skip_endbr(void *);
> +
> extern __noendbr u64 ibt_save(void);
> extern __noendbr void ibt_restore(u64 save);
>
> @@ -71,6 +73,7 @@ extern __noendbr void ibt_restore(u64 sa
> #define __noendbr
>
> static inline bool is_endbr(unsigned int val) { return false; }
> +static inline void *skip_endbr(void *addr) { return addr; }
>
> static inline u64 ibt_save(void) { return 0; }
> static inline void ibt_restore(u64 save) { }
> --- a/arch/x86/include/asm/text-patching.h
> +++ b/arch/x86/include/asm/text-patching.h
> @@ -112,10 +112,7 @@ void __text_gen_insn(void *buf, u8 opcod
> OPTIMIZER_HIDE_VAR(addr);
> OPTIMIZER_HIDE_VAR(dest);
>
> -#ifdef CONFIG_X86_KERNEL_IBT
> - if (is_endbr(*(u32 *)dest))
> - dest += 4;
> -#endif
> + dest = skip_endbr((void *)dest);
>
> insn->opcode = opcode;
>
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -620,6 +620,19 @@ __noendbr void ibt_restore(u64 save)
> }
> }
>
> +
> +void *skip_endbr(void *addr)
> +{
> + unsigned long size, offset;
> +
> + if (is_endbr(*(unsigned int *)addr) &&
> + kallsyms_lookup_size_offset((unsigned long)addr, &size, &offset) &&
> + !offset)
> + addr += 4;
> +
> + return addr;
> +}
> +
> #endif
>
> static __always_inline void setup_cet(struct cpuinfo_x86 *c)
> @@ -636,7 +649,10 @@ static __always_inline void setup_cet(st
> if (!ibt_selftest()) {
> pr_err("IBT selftest: Failed!\n");
> setup_clear_cpu_cap(X86_FEATURE_IBT);
> + return;
> }
> +
> + pr_info("CET detected: Indirect Branch Tracking enabled\n");

This is a little excessive on my 192 CPUs :-)

It also messes with the pr_cont()s in announce_cpu():

[ 3.733446] x86: Booting SMP configuration:
[ 3.734342] .... node #0, CPUs: #1
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.770955] #2
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.802979] #3
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.835459] #4
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.866826] #5
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.898690] #6
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.930355] #7
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.961493] #8
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 3.993500] #9
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 4.024952] #10
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 4.056491] #11
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 4.087493] #12
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 4.118907] #13
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 4.150494] #14
[ 3.534902] CET detected: Indirect Branch Tracking enabled
[ 4.181425] #15


--
Josh

2022-03-02 11:06:28

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 00/39] x86: Kernel IBT

On Tue, Mar 01, 2022 at 03:10:22PM -0800, Josh Poimboeuf wrote:
> On Fri, Feb 25, 2022 at 04:28:32PM +0100, Peter Zijlstra wrote:
> > @@ -636,7 +649,10 @@ static __always_inline void setup_cet(st
> > if (!ibt_selftest()) {
> > pr_err("IBT selftest: Failed!\n");
> > setup_clear_cpu_cap(X86_FEATURE_IBT);
> > + return;
> > }
> > +
> > + pr_info("CET detected: Indirect Branch Tracking enabled\n");
>
> This is a little excessive on my 192 CPUs :-)
>
> It also messes with the pr_cont()s in announce_cpu():

Hehe, I just noticed the same when looking at logs trying to figure out
if kexec worked. I'll go fix.