Hi Peter,
Here I tried to fix 2 issues discussed on the previous thread;
https://lore.kernel.org/all/[email protected]/
- Prohibit probing on __cfi_* preamble symbols, which have the typeid.
- Prohibit probing on compiler generated movl/addl which is used for
detecting typeid on x86.
I'm not sure how arm64 implemented, but it seems
cfi_handler()@arch/arm64/kernel/traps.c just reads the registers for
the typeid instead of decoding the instructions.
I just build tested, since I could not boot the kernel with CFI_CLANG=y.
Would anyone know something about this error?
[ 0.141030] MMIO Stale Data: Unknown: No mitigations
[ 0.153511] SMP alternatives: Using kCFI
[ 0.164593] Freeing SMP alternatives memory: 36K
[ 0.165053] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b
[ 0.166028] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.2-00002-g12b1b2fca8ef #126
[ 0.166028] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 0.166028] Call Trace:
[ 0.166028] <TASK>
[ 0.166028] dump_stack_lvl+0x6e/0xb0
[ 0.166028] panic+0x146/0x2f0
[ 0.166028] ? start_kernel+0x472/0x48b
[ 0.166028] __stack_chk_fail+0x14/0x20
[ 0.166028] start_kernel+0x472/0x48b
[ 0.166028] x86_64_start_reservations+0x24/0x30
[ 0.166028] x86_64_start_kernel+0xa6/0xbb
[ 0.166028] secondary_startup_64_no_verify+0x106/0x11b
[ 0.166028] </TASK>
[ 0.166028] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b ]---
Thank you,
---
Masami Hiramatsu (Google) (2):
kprobes: Prohibit probing on CFI preamble symbol
x86/kprobes: Prohibit probing on compiler generated CFI checking code
arch/x86/kernel/kprobes/core.c | 34 ++++++++++++++++++++++++++++++++++
kernel/kprobes.c | 17 ++++++++++++++++-
2 files changed, 50 insertions(+), 1 deletion(-)
--
Masami Hiramatsu (Google) <[email protected]>
From: Masami Hiramatsu (Google) <[email protected]>
Prohibit probing on the compiler generated CFI typeid checking code
because it is used for decoding typeid when CFI error happens.
The compiler generates the following instruction sequence for indirect
call checks on x86;
movl -<id>, %r10d ; 6 bytes
addl -4(%reg), %r10d ; 4 bytes
je .Ltmp1 ; 2 bytes
ud2 ; <- regs->ip
And handle_cfi_failure() decodes these instructions (movl and addl)
for the typeid and the target address. Thus if we put a kprobe on
those instructions, the decode will fail and report a wrong typeid
and target address.
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
---
arch/x86/kernel/kprobes/core.c | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index f7f6042eb7e6..fa8c2b41cbaf 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -54,6 +54,7 @@
#include <asm/insn.h>
#include <asm/debugreg.h>
#include <asm/ibt.h>
+#include <asm/cfi.h>
#include "common.h"
@@ -293,7 +294,40 @@ static int can_probe(unsigned long paddr)
#endif
addr += insn.length;
}
+ if (IS_ENABLED(CONFIG_CFI_CLANG)) {
+ /*
+ * The compiler generates the following instruction sequence
+ * for indirect call checks and cfi.c decodes this;
+ *
+ * movl -<id>, %r10d ; 6 bytes
+ * addl -4(%reg), %r10d ; 4 bytes
+ * je .Ltmp1 ; 2 bytes
+ * ud2 ; <- regs->ip
+ * .Ltmp1:
+ *
+ * Also, these movl and addl are used for showing expected
+ * type. So those must not be touched.
+ */
+ __addr = recover_probed_instruction(buf, addr);
+ if (!__addr)
+ return 0;
+
+ if (insn_decode_kernel(&insn, (void *)__addr) < 0)
+ return 0;
+
+ if (insn.opcode.value == 0xBA)
+ offset = 12;
+ else if (insn.opcode.value == 0x3)
+ offset = 6;
+ else
+ goto out;
+
+ /* This movl/addl is used for decoding CFI. */
+ if (is_cfi_trap(addr + offset))
+ return 0;
+ }
+out:
return (addr == paddr);
}
From: Masami Hiramatsu (Google) <[email protected]>
Do not allow to probe on "__cfi_" started symbol, because it includes
a typeid value in the code for CFI. Probing it will break the typeid
checking.
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
---
kernel/kprobes.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 00e177de91cc..ce2e460c1f79 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1545,6 +1545,20 @@ static int check_ftrace_location(struct kprobe *p)
return 0;
}
+#ifdef CONFIG_CFI_CLANG
+static bool is_cfi_preamble_symbol(unsigned long addr)
+{
+ char symbuf[KSYM_NAME_LEN];
+
+ if (lookup_symbol_name(addr, symbuf))
+ return false;
+
+ return str_has_prefix("__cfi_", symbuf);
+}
+#else
+#define is_cfi_preamble_symbol(addr) (0)
+#endif
+
static int check_kprobe_address_safe(struct kprobe *p,
struct module **probed_mod)
{
@@ -1563,7 +1577,8 @@ static int check_kprobe_address_safe(struct kprobe *p,
within_kprobe_blacklist((unsigned long) p->addr) ||
jump_label_text_reserved(p->addr, p->addr) ||
static_call_text_reserved(p->addr, p->addr) ||
- find_bug((unsigned long)p->addr)) {
+ find_bug((unsigned long)p->addr) ||
+ is_cfi_preamble_symbol((unsigned long)p->addr)) {
ret = -EINVAL;
goto out;
}
On Mon, Jul 10, 2023 at 09:14:13PM +0900, Masami Hiramatsu (Google) wrote:
> I just build tested, since I could not boot the kernel with CFI_CLANG=y.
> Would anyone know something about this error?
>
> [ 0.141030] MMIO Stale Data: Unknown: No mitigations
> [ 0.153511] SMP alternatives: Using kCFI
> [ 0.164593] Freeing SMP alternatives memory: 36K
> [ 0.165053] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b
> [ 0.166028] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.2-00002-g12b1b2fca8ef #126
> [ 0.166028] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> [ 0.166028] Call Trace:
> [ 0.166028] <TASK>
> [ 0.166028] dump_stack_lvl+0x6e/0xb0
> [ 0.166028] panic+0x146/0x2f0
> [ 0.166028] ? start_kernel+0x472/0x48b
> [ 0.166028] __stack_chk_fail+0x14/0x20
> [ 0.166028] start_kernel+0x472/0x48b
> [ 0.166028] x86_64_start_reservations+0x24/0x30
> [ 0.166028] x86_64_start_kernel+0xa6/0xbb
> [ 0.166028] secondary_startup_64_no_verify+0x106/0x11b
> [ 0.166028] </TASK>
> [ 0.166028] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b ]---
>
>
Hmm, I just build v6.4 using defconfig+kvm_guest.config+CFI_CLANG using
clang-16 and that boots using kvm... (on my IVB, and the thing also
boots natively on my ADL).
I'll go have a look at your patches shortly.
On Mon, Jul 10, 2023 at 09:14:13PM +0900, Masami Hiramatsu (Google) wrote:
> I just build tested, since I could not boot the kernel with CFI_CLANG=y.
> Would anyone know something about this error?
>
> [ 0.141030] MMIO Stale Data: Unknown: No mitigations
> [ 0.153511] SMP alternatives: Using kCFI
> [ 0.164593] Freeing SMP alternatives memory: 36K
> [ 0.165053] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b
> [ 0.166028] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.2-00002-g12b1b2fca8ef #126
> [ 0.166028] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> [ 0.166028] Call Trace:
> [ 0.166028] <TASK>
> [ 0.166028] dump_stack_lvl+0x6e/0xb0
> [ 0.166028] panic+0x146/0x2f0
> [ 0.166028] ? start_kernel+0x472/0x48b
> [ 0.166028] __stack_chk_fail+0x14/0x20
> [ 0.166028] start_kernel+0x472/0x48b
> [ 0.166028] x86_64_start_reservations+0x24/0x30
> [ 0.166028] x86_64_start_kernel+0xa6/0xbb
> [ 0.166028] secondary_startup_64_no_verify+0x106/0x11b
> [ 0.166028] </TASK>
> [ 0.166028] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b ]---
This looks like https://github.com/ClangBuiltLinux/linux/issues/1815 to
me. What version of LLVM are you using? This was fixed in 16.0.4. Commit
514ca14ed544 ("start_kernel: Add __no_stack_protector function
attribute") should resolve it on the Linux side, it looks like that is
in 6.5-rc1. Not sure if we should backport it or just let people upgrade
their toolchains on older releases.
Cheers,
Nathan
On Mon, Jul 10, 2023 at 09:14:24PM +0900, Masami Hiramatsu (Google) wrote:
> +#ifdef CONFIG_CFI_CLANG
> +static bool is_cfi_preamble_symbol(unsigned long addr)
> +{
> + char symbuf[KSYM_NAME_LEN];
> +
> + if (lookup_symbol_name(addr, symbuf))
> + return false;
> +
> + return str_has_prefix("__cfi_", symbuf)
|| str_has_prefix("__pfx_", symbol);
The __pfx_ symbols can happen when !CFI_CLANG but still having
FUNCTION_PADDING_BYTES.
> +}
> +#else
> +#define is_cfi_preamble_symbol(addr) (0)
> +#endif
As such I think we can do the above unconditionally, without either
there should not be any matching symbols.
On Mon, Jul 10, 2023 at 09:14:35PM +0900, Masami Hiramatsu (Google) wrote:
> + if (IS_ENABLED(CONFIG_CFI_CLANG)) {
> + /*
> + * The compiler generates the following instruction sequence
> + * for indirect call checks and cfi.c decodes this;
> + *
> + *? movl -<id>, %r10d ; 6 bytes
> + * addl -4(%reg), %r10d ; 4 bytes
> + * je .Ltmp1 ; 2 bytes
> + * ud2 ; <- regs->ip
> + * .Ltmp1:
> + *
> + * Also, these movl and addl are used for showing expected
> + * type. So those must not be touched.
> + */
> + __addr = recover_probed_instruction(buf, addr);
> + if (!__addr)
> + return 0;
> +
> + if (insn_decode_kernel(&insn, (void *)__addr) < 0)
> + return 0;
> +
> + if (insn.opcode.value == 0xBA)
> + offset = 12;
> + else if (insn.opcode.value == 0x3)
> + offset = 6;
> + else
> + goto out;
Notably, the JE will already be avoided?
> +
> + /* This movl/addl is used for decoding CFI. */
> + if (is_cfi_trap(addr + offset))
> + return 0;
> + }
>
> +out:
> return (addr == paddr);
> }
Hmm, so I was thinking something like the below, which also catches
things when we rewrite kCFI to FineIBT, except I don't think we care if
the FineIBT callsite gets re-written. FineIBT only relies on the __cfi_
symbol not getting poked at (which the previous patches should ensure).
Additionally is_cfi_trap() is horrifically slow -- it's a full linear
search of the entire kcfi_traps array, it doesn't have any smarts on
(#UD can hardly be considered a fast path).
So I tihnk I'm ok with the above, just adding the below for reference
(completely untested and everything).
Acked-by: Peter Zijlstra (Intel) <[email protected]>
---
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index f7f6042eb7e6..b812dee76909 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -293,6 +293,11 @@ static int can_probe(unsigned long paddr)
#endif
addr += insn.length;
}
+ /*
+ * Don't allow poking the kCFI/FineIBT callsites.
+ */
+ if (IS_ENABLED(CONFIG_CFI_CLANG) && cfi_call_site(addr))
+ return 0;
return (addr == paddr);
}
diff --git a/kernel/cfi.c b/kernel/cfi.c
index 08caad776717..2656e6ffa013 100644
--- a/kernel/cfi.c
+++ b/kernel/cfi.c
@@ -31,16 +31,22 @@ static inline unsigned long trap_address(s32 *p)
return (unsigned long)((long)p + (long)*p);
}
-static bool is_trap(unsigned long addr, s32 *start, s32 *end)
+static long cfi_trap_distance(unsigned long addr, s32 *start, s32 *end)
{
+ long dist = LONG_MAX;
s32 *p;
for (p = start; p < end; ++p) {
- if (trap_address(p) == addr)
- return true;
+ long d = trap_address(p) - addr;
+
+ if (abs(dist) < abs(d)) {
+ dist = d;
+ if (dist == 0)
+ return 0;
+ }
}
- return false;
+ return dist;
}
#ifdef CONFIG_MODULES
@@ -66,25 +72,25 @@ void module_cfi_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
}
}
-static bool is_module_cfi_trap(unsigned long addr)
+static long module_cfi_trap_distance(unsigned long addr)
{
struct module *mod;
- bool found = false;
+ long dist = LONG_MAX;
rcu_read_lock_sched_notrace();
mod = __module_address(addr);
if (mod)
- found = is_trap(addr, mod->kcfi_traps, mod->kcfi_traps_end);
+ dist = cfi_trap_distance(addr, mod->kcfi_traps, mod->kcfi_traps_end);
rcu_read_unlock_sched_notrace();
return found;
}
#else /* CONFIG_MODULES */
-static inline bool is_module_cfi_trap(unsigned long addr)
+static long module_cfi_trap_distance(unsigned long addr)
{
- return false;
+ return LONG_MAX;
}
#endif /* CONFIG_MODULES */
@@ -93,9 +99,24 @@ extern s32 __stop___kcfi_traps[];
bool is_cfi_trap(unsigned long addr)
{
- if (is_trap(addr, __start___kcfi_traps, __stop___kcfi_traps))
+ long dist = cfi_trap_distance(addr, __start___kcfi_traps, __stop___kcfi_traps);
+ if (!dist)
+ return true;
+
+ return module_cfi_trap_distance(addr) == 0;
+}
+
+bool cfi_call_site(unsigned long addr)
+{
+ long dist = cfi_trap_distance(addr, __start___kcfi_traps, __stop___kcfi_traps);
+ if (dist >= -12 && dist <= 0)
+ return true;
+
+ dist = module_cfi_trap_distance(addr);
+ if (dist >= -12 && dist <= 0)
return true;
- return is_module_cfi_trap(addr);
+ return false;
}
+
#endif /* CONFIG_ARCH_USES_CFI_TRAPS */
On Mon, Jul 10, 2023 at 06:16:43PM +0200, Peter Zijlstra wrote:
> diff --git a/kernel/cfi.c b/kernel/cfi.c
> index 08caad776717..2656e6ffa013 100644
> --- a/kernel/cfi.c
> +++ b/kernel/cfi.c
> @@ -31,16 +31,22 @@ static inline unsigned long trap_address(s32 *p)
> return (unsigned long)((long)p + (long)*p);
> }
>
> -static bool is_trap(unsigned long addr, s32 *start, s32 *end)
> +static long cfi_trap_distance(unsigned long addr, s32 *start, s32 *end)
> {
> + long dist = LONG_MAX;
> s32 *p;
>
> for (p = start; p < end; ++p) {
> - if (trap_address(p) == addr)
> - return true;
> + long d = trap_address(p) - addr;
> +
> + if (abs(dist) < abs(d)) {
Not that I expect anybody will care, but that should obviously be:
abs(d) < abs(dist)
> + dist = d;
> + if (dist == 0)
> + return 0;
> + }
> }
>
> - return false;
> + return dist;
> }
On Mon, 10 Jul 2023 17:37:24 +0200
Peter Zijlstra <[email protected]> wrote:
> On Mon, Jul 10, 2023 at 09:14:24PM +0900, Masami Hiramatsu (Google) wrote:
>
>
> > +#ifdef CONFIG_CFI_CLANG
> > +static bool is_cfi_preamble_symbol(unsigned long addr)
> > +{
> > + char symbuf[KSYM_NAME_LEN];
> > +
> > + if (lookup_symbol_name(addr, symbuf))
> > + return false;
> > +
> > + return str_has_prefix("__cfi_", symbuf)
> || str_has_prefix("__pfx_", symbol);
>
> The __pfx_ symbols can happen when !CFI_CLANG but still having
> FUNCTION_PADDING_BYTES.
Indeed. Currently __pfx is not probed via tracefs interface because it is
notrace function but kprobe itself should also prohibit that.
>
> > +}
> > +#else
> > +#define is_cfi_preamble_symbol(addr) (0)
> > +#endif
>
> As such I think we can do the above unconditionally, without either
> there should not be any matching symbols.
OK.
Thank you!
>
--
Masami Hiramatsu (Google) <[email protected]>
On Mon, 10 Jul 2023 18:16:43 +0200
Peter Zijlstra <[email protected]> wrote:
> On Mon, Jul 10, 2023 at 09:14:35PM +0900, Masami Hiramatsu (Google) wrote:
> > + if (IS_ENABLED(CONFIG_CFI_CLANG)) {
> > + /*
> > + * The compiler generates the following instruction sequence
> > + * for indirect call checks and cfi.c decodes this;
> > + *
> > + * movl -<id>, %r10d ; 6 bytes
> > + * addl -4(%reg), %r10d ; 4 bytes
> > + * je .Ltmp1 ; 2 bytes
> > + * ud2 ; <- regs->ip
> > + * .Ltmp1:
> > + *
> > + * Also, these movl and addl are used for showing expected
> > + * type. So those must not be touched.
> > + */
> > + __addr = recover_probed_instruction(buf, addr);
> > + if (!__addr)
> > + return 0;
> > +
> > + if (insn_decode_kernel(&insn, (void *)__addr) < 0)
> > + return 0;
> > +
> > + if (insn.opcode.value == 0xBA)
> > + offset = 12;
> > + else if (insn.opcode.value == 0x3)
> > + offset = 6;
> > + else
> > + goto out;
>
> Notably, the JE will already be avoided?
>
> > +
> > + /* This movl/addl is used for decoding CFI. */
> > + if (is_cfi_trap(addr + offset))
> > + return 0;
> > + }
> >
> > +out:
> > return (addr == paddr);
> > }
>
> Hmm, so I was thinking something like the below, which also catches
> things when we rewrite kCFI to FineIBT, except I don't think we care if
> the FineIBT callsite gets re-written. FineIBT only relies on the __cfi_
> symbol not getting poked at (which the previous patches should ensure).
Oh, is FineIBT different from kCFI? I thought those are same. But anyway
for kCFI=y && FineIBT=n case, I think this code still needed.
>
> Additionally is_cfi_trap() is horrifically slow -- it's a full linear
> search of the entire kcfi_traps array, it doesn't have any smarts on
> (#UD can hardly be considered a fast path).
Yeah, register_kprobe() is not a fast path in most cases. So I think
this is OK at this point. We can speed it up by sorting the array by
address and binary search later.
>
> So I tihnk I'm ok with the above, just adding the below for reference
> (completely untested and everything).
I wonder the distance can be used outside of x86. CFI implementation
depends on the arch?
>
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
Thanks!
>
>
> ---
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index f7f6042eb7e6..b812dee76909 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -293,6 +293,11 @@ static int can_probe(unsigned long paddr)
> #endif
> addr += insn.length;
> }
> + /*
> + * Don't allow poking the kCFI/FineIBT callsites.
> + */
> + if (IS_ENABLED(CONFIG_CFI_CLANG) && cfi_call_site(addr))
> + return 0;
>
> return (addr == paddr);
> }
> diff --git a/kernel/cfi.c b/kernel/cfi.c
> index 08caad776717..2656e6ffa013 100644
> --- a/kernel/cfi.c
> +++ b/kernel/cfi.c
> @@ -31,16 +31,22 @@ static inline unsigned long trap_address(s32 *p)
> return (unsigned long)((long)p + (long)*p);
> }
>
> -static bool is_trap(unsigned long addr, s32 *start, s32 *end)
> +static long cfi_trap_distance(unsigned long addr, s32 *start, s32 *end)
> {
> + long dist = LONG_MAX;
> s32 *p;
>
> for (p = start; p < end; ++p) {
> - if (trap_address(p) == addr)
> - return true;
> + long d = trap_address(p) - addr;
> +
> + if (abs(dist) < abs(d)) {
> + dist = d;
> + if (dist == 0)
> + return 0;
> + }
> }
>
> - return false;
> + return dist;
> }
>
> #ifdef CONFIG_MODULES
> @@ -66,25 +72,25 @@ void module_cfi_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
> }
> }
>
> -static bool is_module_cfi_trap(unsigned long addr)
> +static long module_cfi_trap_distance(unsigned long addr)
> {
> struct module *mod;
> - bool found = false;
> + long dist = LONG_MAX;
>
> rcu_read_lock_sched_notrace();
>
> mod = __module_address(addr);
> if (mod)
> - found = is_trap(addr, mod->kcfi_traps, mod->kcfi_traps_end);
> + dist = cfi_trap_distance(addr, mod->kcfi_traps, mod->kcfi_traps_end);
>
> rcu_read_unlock_sched_notrace();
>
> return found;
> }
> #else /* CONFIG_MODULES */
> -static inline bool is_module_cfi_trap(unsigned long addr)
> +static long module_cfi_trap_distance(unsigned long addr)
> {
> - return false;
> + return LONG_MAX;
> }
> #endif /* CONFIG_MODULES */
>
> @@ -93,9 +99,24 @@ extern s32 __stop___kcfi_traps[];
>
> bool is_cfi_trap(unsigned long addr)
> {
> - if (is_trap(addr, __start___kcfi_traps, __stop___kcfi_traps))
> + long dist = cfi_trap_distance(addr, __start___kcfi_traps, __stop___kcfi_traps);
> + if (!dist)
> + return true;
> +
> + return module_cfi_trap_distance(addr) == 0;
> +}
> +
> +bool cfi_call_site(unsigned long addr)
> +{
> + long dist = cfi_trap_distance(addr, __start___kcfi_traps, __stop___kcfi_traps);
> + if (dist >= -12 && dist <= 0)
> + return true;
> +
> + dist = module_cfi_trap_distance(addr);
> + if (dist >= -12 && dist <= 0)
> return true;
>
> - return is_module_cfi_trap(addr);
> + return false;
> }
> +
> #endif /* CONFIG_ARCH_USES_CFI_TRAPS */
--
Masami Hiramatsu (Google) <[email protected]>
On Mon, 10 Jul 2023 08:57:03 -0700
Nathan Chancellor <[email protected]> wrote:
> On Mon, Jul 10, 2023 at 09:14:13PM +0900, Masami Hiramatsu (Google) wrote:
> > I just build tested, since I could not boot the kernel with CFI_CLANG=y.
> > Would anyone know something about this error?
> >
> > [ 0.141030] MMIO Stale Data: Unknown: No mitigations
> > [ 0.153511] SMP alternatives: Using kCFI
> > [ 0.164593] Freeing SMP alternatives memory: 36K
> > [ 0.165053] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b
> > [ 0.166028] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.2-00002-g12b1b2fca8ef #126
> > [ 0.166028] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > [ 0.166028] Call Trace:
> > [ 0.166028] <TASK>
> > [ 0.166028] dump_stack_lvl+0x6e/0xb0
> > [ 0.166028] panic+0x146/0x2f0
> > [ 0.166028] ? start_kernel+0x472/0x48b
> > [ 0.166028] __stack_chk_fail+0x14/0x20
> > [ 0.166028] start_kernel+0x472/0x48b
> > [ 0.166028] x86_64_start_reservations+0x24/0x30
> > [ 0.166028] x86_64_start_kernel+0xa6/0xbb
> > [ 0.166028] secondary_startup_64_no_verify+0x106/0x11b
> > [ 0.166028] </TASK>
> > [ 0.166028] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b ]---
>
> This looks like https://github.com/ClangBuiltLinux/linux/issues/1815 to
> me. What version of LLVM are you using? This was fixed in 16.0.4. Commit
> 514ca14ed544 ("start_kernel: Add __no_stack_protector function
> attribute") should resolve it on the Linux side, it looks like that is
> in 6.5-rc1. Not sure if we should backport it or just let people upgrade
> their toolchains on older releases.
Thanks for the info. I confirmed that the commit fixed the boot issue.
So I think it should be backported to the stable tree.
Thanks!
>
> Cheers,
> Nathan
--
Masami Hiramatsu (Google) <[email protected]>
On Tue, Jul 11, 2023 at 08:58:37AM +0900, Masami Hiramatsu wrote:
> > Hmm, so I was thinking something like the below, which also catches
> > things when we rewrite kCFI to FineIBT, except I don't think we care if
> > the FineIBT callsite gets re-written. FineIBT only relies on the __cfi_
> > symbol not getting poked at (which the previous patches should ensure).
>
> Oh, is FineIBT different from kCFI? I thought those are same. But anyway
> for kCFI=y && FineIBT=n case, I think this code still needed.
Ah, I forgot to answer your other question; FineIBT is dynamically
patched since it relies on optional hardware features, only if the
hardware has IBT support can FineIBT work.
So yeah, your check is definitely needed. And I think it'll 'gracefully'
fail when FineIBT is patched in because the instructions aren't exactly
the same.
And since FineIBT has most of the magic in the __cfi_ prefix, which
isn't patched per the previous patch, I think we're good on the call-site.
> > So I tihnk I'm ok with the above, just adding the below for reference
> > (completely untested and everything).
>
> I wonder the distance can be used outside of x86. CFI implementation
> depends on the arch?
Currently only arm64 and x86_64 have CFI, and while the particulars are a
little different, they do have a lot in common, including the reporting.
IIRC the arm64 variant is a little simpler because of fixed instruction
size. There is no worry someone will jump in the middle of an
instruction stream.
On Tue, Jul 11, 2023 at 08:58:37AM +0900, Masami Hiramatsu wrote:
> Oh, is FineIBT different from kCFI? I thought those are same. But anyway
> for kCFI=y && FineIBT=n case, I think this code still needed.
Yeah, FineIBT relies on kCFI and IBT (and selects CALL_PADDING) and
dynamically re-writes the kCFI infrastructure.
All the gory details are in arch/x86/kernel/alternative.c, search for
CONFIG_FINEIBT, I've tried to write a big comment, but please let me
know if something isn't clear and I'll write some actual words :-).
But basically kCFI is a pure software solution and does the check before
the indirect control transfer (it has to). While FineIBT relies on the
IBT hardware in order to do the check after.
As such, FineIBT can do the CFI check without memory loads, which is
good for performance. Another useful property of FineIBT is that it is a
speculation barrier.
Masami, thanks for verifying!
Hi Greg and Sasha,
On Tue, Jul 11, 2023 at 10:33:03AM +0900, Masami Hiramatsu wrote:
> On Mon, 10 Jul 2023 08:57:03 -0700
> Nathan Chancellor <[email protected]> wrote:
>
> > On Mon, Jul 10, 2023 at 09:14:13PM +0900, Masami Hiramatsu (Google) wrote:
> > > I just build tested, since I could not boot the kernel with CFI_CLANG=y.
> > > Would anyone know something about this error?
> > >
> > > [ 0.141030] MMIO Stale Data: Unknown: No mitigations
> > > [ 0.153511] SMP alternatives: Using kCFI
> > > [ 0.164593] Freeing SMP alternatives memory: 36K
> > > [ 0.165053] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b
> > > [ 0.166028] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.2-00002-g12b1b2fca8ef #126
> > > [ 0.166028] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > [ 0.166028] Call Trace:
> > > [ 0.166028] <TASK>
> > > [ 0.166028] dump_stack_lvl+0x6e/0xb0
> > > [ 0.166028] panic+0x146/0x2f0
> > > [ 0.166028] ? start_kernel+0x472/0x48b
> > > [ 0.166028] __stack_chk_fail+0x14/0x20
> > > [ 0.166028] start_kernel+0x472/0x48b
> > > [ 0.166028] x86_64_start_reservations+0x24/0x30
> > > [ 0.166028] x86_64_start_kernel+0xa6/0xbb
> > > [ 0.166028] secondary_startup_64_no_verify+0x106/0x11b
> > > [ 0.166028] </TASK>
> > > [ 0.166028] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b ]---
> >
> > This looks like https://github.com/ClangBuiltLinux/linux/issues/1815 to
> > me. What version of LLVM are you using? This was fixed in 16.0.4. Commit
> > 514ca14ed544 ("start_kernel: Add __no_stack_protector function
> > attribute") should resolve it on the Linux side, it looks like that is
> > in 6.5-rc1. Not sure if we should backport it or just let people upgrade
> > their toolchains on older releases.
>
> Thanks for the info. I confirmed that the commit fixed the boot issue.
> So I think it should be backported to the stable tree.
Would you please apply commit 514ca14ed544 ("start_kernel: Add
__no_stack_protector function attribute") to linux-6.4.y? The series
ending with commit 611d4c716db0 ("x86/hyperv: Mark hv_ghcb_terminate()
as noreturn") that shipped in 6.4 exposes an LLVM issue that affected
16.0.0 and 16.0.1, which was resolved in 16.0.2. When using those
affected LLVM releases, the following crash at boot occurs:
[ 0.181667] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x3cf/0x3d0
[ 0.182621] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.3 #1
[ 0.182621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[ 0.182621] Call Trace:
[ 0.182621] <TASK>
[ 0.182621] dump_stack_lvl+0x6a/0xa0
[ 0.182621] panic+0x124/0x2f0
[ 0.182621] ? start_kernel+0x3cf/0x3d0
[ 0.182621] ? acpi_enable+0x64/0xc0
[ 0.182621] __stack_chk_fail+0x14/0x20
[ 0.182621] start_kernel+0x3cf/0x3d0
[ 0.182621] x86_64_start_reservations+0x24/0x30
[ 0.182621] x86_64_start_kernel+0xab/0xb0
[ 0.182621] secondary_startup_64_no_verify+0x107/0x10b
[ 0.182621] </TASK>
[ 0.182621] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x3cf/0x3d0 ]---
514ca14ed544 aims to avoid this on the Linux side. I have verified that
it applies to 6.4.3 cleanly and resolves the issue there, as has Masami.
If there are any issues or questions, please let me know.
Cheers,
Nathan
On Tue, Jul 11, 2023 at 11:37:04AM -0700, Nathan Chancellor wrote:
> Masami, thanks for verifying!
>
> Hi Greg and Sasha,
>
> On Tue, Jul 11, 2023 at 10:33:03AM +0900, Masami Hiramatsu wrote:
> > On Mon, 10 Jul 2023 08:57:03 -0700
> > Nathan Chancellor <[email protected]> wrote:
> >
> > > On Mon, Jul 10, 2023 at 09:14:13PM +0900, Masami Hiramatsu (Google) wrote:
> > > > I just build tested, since I could not boot the kernel with CFI_CLANG=y.
> > > > Would anyone know something about this error?
> > > >
> > > > [ 0.141030] MMIO Stale Data: Unknown: No mitigations
> > > > [ 0.153511] SMP alternatives: Using kCFI
> > > > [ 0.164593] Freeing SMP alternatives memory: 36K
> > > > [ 0.165053] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b
> > > > [ 0.166028] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.2-00002-g12b1b2fca8ef #126
> > > > [ 0.166028] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
> > > > [ 0.166028] Call Trace:
> > > > [ 0.166028] <TASK>
> > > > [ 0.166028] dump_stack_lvl+0x6e/0xb0
> > > > [ 0.166028] panic+0x146/0x2f0
> > > > [ 0.166028] ? start_kernel+0x472/0x48b
> > > > [ 0.166028] __stack_chk_fail+0x14/0x20
> > > > [ 0.166028] start_kernel+0x472/0x48b
> > > > [ 0.166028] x86_64_start_reservations+0x24/0x30
> > > > [ 0.166028] x86_64_start_kernel+0xa6/0xbb
> > > > [ 0.166028] secondary_startup_64_no_verify+0x106/0x11b
> > > > [ 0.166028] </TASK>
> > > > [ 0.166028] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x472/0x48b ]---
> > >
> > > This looks like https://github.com/ClangBuiltLinux/linux/issues/1815 to
> > > me. What version of LLVM are you using? This was fixed in 16.0.4. Commit
> > > 514ca14ed544 ("start_kernel: Add __no_stack_protector function
> > > attribute") should resolve it on the Linux side, it looks like that is
> > > in 6.5-rc1. Not sure if we should backport it or just let people upgrade
> > > their toolchains on older releases.
> >
> > Thanks for the info. I confirmed that the commit fixed the boot issue.
> > So I think it should be backported to the stable tree.
>
> Would you please apply commit 514ca14ed544 ("start_kernel: Add
> __no_stack_protector function attribute") to linux-6.4.y? The series
> ending with commit 611d4c716db0 ("x86/hyperv: Mark hv_ghcb_terminate()
> as noreturn") that shipped in 6.4 exposes an LLVM issue that affected
> 16.0.0 and 16.0.1, which was resolved in 16.0.2. When using those
> affected LLVM releases, the following crash at boot occurs:
>
> [ 0.181667] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x3cf/0x3d0
> [ 0.182621] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.3 #1
> [ 0.182621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
> [ 0.182621] Call Trace:
> [ 0.182621] <TASK>
> [ 0.182621] dump_stack_lvl+0x6a/0xa0
> [ 0.182621] panic+0x124/0x2f0
> [ 0.182621] ? start_kernel+0x3cf/0x3d0
> [ 0.182621] ? acpi_enable+0x64/0xc0
> [ 0.182621] __stack_chk_fail+0x14/0x20
> [ 0.182621] start_kernel+0x3cf/0x3d0
> [ 0.182621] x86_64_start_reservations+0x24/0x30
> [ 0.182621] x86_64_start_kernel+0xab/0xb0
> [ 0.182621] secondary_startup_64_no_verify+0x107/0x10b
> [ 0.182621] </TASK>
> [ 0.182621] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_kernel+0x3cf/0x3d0 ]---
>
> 514ca14ed544 aims to avoid this on the Linux side. I have verified that
> it applies to 6.4.3 cleanly and resolves the issue there, as has Masami.
>
> If there are any issues or questions, please let me know.
Now queued up, thanks.
greg k-h