The change fixes boot failure on physical machine where kernel
is built with gcc-10 with stack-protector enabled by default:
```
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary+0x191/0x1a0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
Call Trace:
dump_stack+0x71/0xa0
panic+0x107/0x2b8
? start_secondary+0x191/0x1a0
__stack_chk_fail+0x15/0x20
start_secondary+0x191/0x1a0
secondary_startup_64+0xa4/0xb0
-—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary+0x191
```
This happens because `start_secondary()` is responsible for setting
up initial stack canary value in `smpboot.c`, but nothing prevents
gcc from inserting stack canary into `start_secondary()` itself
before `boot_init_stack_canary()` call.
The fix passes `-fno-stack-protector` to avoid any early stack
checks in `start_secondary()` or inlined functions into it.
Tested the change by successfully booting the machine.
A few similar crashes on VMs:
- https://bugzilla.redhat.com/show_bug.cgi?id=1796780
- http://rglinuxtech.com/?p=2694
CC: Jakub Jelinek <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Andy Lutomirski <[email protected]>
CC: [email protected]
Signed-off-by: Sergei Trofimovich <[email protected]>
---
arch/x86/kernel/Makefile | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9b294c13809a..da9f4ea9bf4c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -11,6 +11,12 @@ extra-y += vmlinux.lds
CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
+# smpboot's init_secondary initializes stack canary.
+# Make sure we don't emit stack checks before it's
+# initialized.
+nostackp := $(call cc-option, -fno-stack-protector)
+CFLAGS_smpboot.o := $(nostackp)
+
ifdef CONFIG_FUNCTION_TRACER
# Do not profile debug and lowlevel utilities
CFLAGS_REMOVE_tsc.o = -pg
--
2.25.1
On Sat, Mar 14, 2020 at 04:44:51PM +0000, Sergei Trofimovich wrote:
> The change fixes boot failure on physical machine where kernel
> is built with gcc-10 with stack-protector enabled by default:
> This happens because `start_secondary()` is responsible for setting
> up initial stack canary value in `smpboot.c`, but nothing prevents
> gcc from inserting stack canary into `start_secondary()` itself
> before `boot_init_stack_canary()` call.
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index 9b294c13809a..da9f4ea9bf4c 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -11,6 +11,12 @@ extra-y += vmlinux.lds
>
> CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
>
> +# smpboot's init_secondary initializes stack canary.
> +# Make sure we don't emit stack checks before it's
> +# initialized.
> +nostackp := $(call cc-option, -fno-stack-protector)
> +CFLAGS_smpboot.o := $(nostackp)
What makes GCC10 insert this while GCC9 does not. Also, I would much
rather GCC10 add a function attrbute to kill this:
__attribute__((no_stack_protect))
Then we can explicitly clear this one function and keep it on for the
rest of the file.
On Mon, Mar 16, 2020 at 02:26:48PM +0100, Jakub Jelinek wrote:
> On Mon, Mar 16, 2020 at 02:04:14PM +0100, Peter Zijlstra wrote:
> > > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> > > index 9b294c13809a..da9f4ea9bf4c 100644
> > > --- a/arch/x86/kernel/Makefile
> > > +++ b/arch/x86/kernel/Makefile
> > > @@ -11,6 +11,12 @@ extra-y += vmlinux.lds
> > >
> > > CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
> > >
> > > +# smpboot's init_secondary initializes stack canary.
> > > +# Make sure we don't emit stack checks before it's
> > > +# initialized.
> > > +nostackp := $(call cc-option, -fno-stack-protector)
> > > +CFLAGS_smpboot.o := $(nostackp)
> >
> > What makes GCC10 insert this while GCC9 does not. Also, I would much
>
> My bet is different inlining decisions.
> If somebody hands me over the preprocessed source + gcc command line, I can
> have a look in detail (which exact change and why).
>
> > rather GCC10 add a function attrbute to kill this:
> >
> > __attribute__((no_stack_protect))
>
> There is no such attribute,
Right I know, I looked for it recently :/ But since this is new in 10
and 10 isn't released yet, I figured someone can add the attribute
before it does get released.
> only __attribute__((stack_protect)) which is
> meant mainly for -fstack-protector-explicit and does the opposite, or
> __attribute__((optimize ("no-stack-protector"))) (which will work only
> in GCC7+, since https://gcc.gnu.org/PR71585 changes).
Cute..
> Or of course you could add noinline attribute to whatever got inlined
> and contains some array or addressable variable that whatever
> -fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
> it would never work even in the past I believe.
I don't think the kernel supports -fstack-protector-all, but I could be
mistaken.
On Mon, Mar 16, 2020 at 02:04:14PM +0100, Peter Zijlstra wrote:
> > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> > index 9b294c13809a..da9f4ea9bf4c 100644
> > --- a/arch/x86/kernel/Makefile
> > +++ b/arch/x86/kernel/Makefile
> > @@ -11,6 +11,12 @@ extra-y += vmlinux.lds
> >
> > CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
> >
> > +# smpboot's init_secondary initializes stack canary.
> > +# Make sure we don't emit stack checks before it's
> > +# initialized.
> > +nostackp := $(call cc-option, -fno-stack-protector)
> > +CFLAGS_smpboot.o := $(nostackp)
>
> What makes GCC10 insert this while GCC9 does not. Also, I would much
My bet is different inlining decisions.
If somebody hands me over the preprocessed source + gcc command line, I can
have a look in detail (which exact change and why).
> rather GCC10 add a function attrbute to kill this:
>
> __attribute__((no_stack_protect))
There is no such attribute, only __attribute__((stack_protect)) which is
meant mainly for -fstack-protector-explicit and does the opposite, or
__attribute__((optimize ("no-stack-protector"))) (which will work only
in GCC7+, since https://gcc.gnu.org/PR71585 changes).
Or of course you could add noinline attribute to whatever got inlined
and contains some array or addressable variable that whatever
-fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
it would never work even in the past I believe.
Jakub
On Mon, Mar 16, 2020 at 02:42:34PM +0100, Peter Zijlstra wrote:
> Right I know, I looked for it recently :/ But since this is new in 10
> and 10 isn't released yet, I figured someone can add the attribute
> before it does get released.
Yes, that would be a good solution.
I looked at what happens briefly after building gcc10 from git and IINM,
the function in question - start_secondary() - already gets the stack
canary asm glue added so it checks for a stack canary.
However, the stack canary value itself gets set later in that same
function:
/* to prevent fake stack check failure in clock setup */
boot_init_stack_canary();
so the asm glue which checks for it would need to reload the newly
computed canary value (it is 0 before we compute it and thus the
mismatch).
So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
> > Or of course you could add noinline attribute to whatever got inlined
> > and contains some array or addressable variable that whatever
> > -fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
> > it would never work even in the past I believe.
>
> I don't think the kernel supports -fstack-protector-all, but I could be
> mistaken.
The other thing I was thinking was to carve out only that function into
a separate compilation unit and disable stack protector only for it.
All IMHO of course.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Mar 16, 2020 at 06:54:50PM +0100, Borislav Petkov wrote:
> On Mon, Mar 16, 2020 at 02:42:34PM +0100, Peter Zijlstra wrote:
> > Right I know, I looked for it recently :/ But since this is new in 10
> > and 10 isn't released yet, I figured someone can add the attribute
> > before it does get released.
>
> Yes, that would be a good solution.
>
> I looked at what happens briefly after building gcc10 from git and IINM,
> the function in question - start_secondary() - already gets the stack
> canary asm glue added so it checks for a stack canary.
>
> However, the stack canary value itself gets set later in that same
> function:
>
> /* to prevent fake stack check failure in clock setup */
> boot_init_stack_canary();
>
> so the asm glue which checks for it would need to reload the newly
> computed canary value (it is 0 before we compute it and thus the
> mismatch).
>
> So having a way to state "do not add stack canary checking to this
> particular function" would be optimal. And since you already have the
> "stack_protect" function attribute I figure adding a "no_stack_protect"
> one should be easy...
>
> > > Or of course you could add noinline attribute to whatever got inlined
> > > and contains some array or addressable variable that whatever
> > > -fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
> > > it would never work even in the past I believe.
> >
> > I don't think the kernel supports -fstack-protector-all, but I could be
> > mistaken.
>
> The other thing I was thinking was to carve out only that function into
> a separate compilation unit and disable stack protector only for it.
>
> All IMHO of course.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
With STACKPROTECTOR_STRONG, gcc9 (at least gentoo's version, not sure if
they have some patches that affect it) already adds stack canary into
start_secondary. Not sure why it doesn't panic already with gcc9?
00000000000008f0 <start_secondary>:
8f0: 53 push %rbx
8f1: 48 83 ec 10 sub $0x10,%rsp
8f5: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
8fc: 00 00
8fe: 48 89 44 24 08 mov %rax,0x8(%rsp)
903: 31 c0 xor %eax,%eax
...
a2e: e8 00 00 00 00 callq a33 <start_secondary+0x143>
a2f: R_X86_64_PLT32 cpu_startup_entry-0x4
a33: 48 8b 44 24 08 mov 0x8(%rsp),%rax
a38: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
a3f: 00 00
a41: 75 12 jne a55 <start_secondary+0x165>
a43: 48 83 c4 10 add $0x10,%rsp
a47: 5b pop %rbx
a48: c3 retq
a49: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a50 <start_secondary+0x160>
a4c: R_X86_64_PC32 debug_idt_descr-0x4
a50: e9 cb fe ff ff jmpq 920 <start_secondary+0x30>
a55: e8 00 00 00 00 callq a5a <start_secondary+0x16a>
a56: R_X86_64_PLT32 __stack_chk_fail-0x4
On Sat, Mar 14, 2020 at 04:44:51PM +0000, Sergei Trofimovich wrote:
> arch/x86/kernel/Makefile | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index 9b294c13809a..da9f4ea9bf4c 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -11,6 +11,12 @@ extra-y += vmlinux.lds
>
> CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
>
> +# smpboot's init_secondary initializes stack canary.
^^^^ should be start_secondary?
> +# Make sure we don't emit stack checks before it's
> +# initialized.
> +nostackp := $(call cc-option, -fno-stack-protector)
> +CFLAGS_smpboot.o := $(nostackp)
> +
> ifdef CONFIG_FUNCTION_TRACER
> # Do not profile debug and lowlevel utilities
> CFLAGS_REMOVE_tsc.o = -pg
> --
> 2.25.1
>
On Mon, Mar 16, 2020 at 06:54:50PM +0100, Borislav Petkov wrote:
> So having a way to state "do not add stack canary checking to this
> particular function" would be optimal. And since you already have the
> "stack_protect" function attribute I figure adding a "no_stack_protect"
> one should be easy...
Easy, but a waste when GCC already has the optimize attribute that can
handle also ~450 other options that are per-function rather than per-TU.
Jakub
On Mon, Mar 16, 2020 at 02:20:00PM -0400, Arvind Sankar wrote:
> On Mon, Mar 16, 2020 at 06:54:50PM +0100, Borislav Petkov wrote:
> > On Mon, Mar 16, 2020 at 02:42:34PM +0100, Peter Zijlstra wrote:
> > > Right I know, I looked for it recently :/ But since this is new in 10
> > > and 10 isn't released yet, I figured someone can add the attribute
> > > before it does get released.
> >
> > Yes, that would be a good solution.
> >
> > I looked at what happens briefly after building gcc10 from git and IINM,
> > the function in question - start_secondary() - already gets the stack
> > canary asm glue added so it checks for a stack canary.
> >
> > However, the stack canary value itself gets set later in that same
> > function:
> >
> > /* to prevent fake stack check failure in clock setup */
> > boot_init_stack_canary();
> >
> > so the asm glue which checks for it would need to reload the newly
> > computed canary value (it is 0 before we compute it and thus the
> > mismatch).
> >
> > So having a way to state "do not add stack canary checking to this
> > particular function" would be optimal. And since you already have the
> > "stack_protect" function attribute I figure adding a "no_stack_protect"
> > one should be easy...
> >
> > > > Or of course you could add noinline attribute to whatever got inlined
> > > > and contains some array or addressable variable that whatever
> > > > -fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
> > > > it would never work even in the past I believe.
> > >
> > > I don't think the kernel supports -fstack-protector-all, but I could be
> > > mistaken.
> >
> > The other thing I was thinking was to carve out only that function into
> > a separate compilation unit and disable stack protector only for it.
> >
> > All IMHO of course.
> >
> > Thx.
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette
>
> With STACKPROTECTOR_STRONG, gcc9 (at least gentoo's version, not sure if
> they have some patches that affect it) already adds stack canary into
> start_secondary. Not sure why it doesn't panic already with gcc9?
>
> 00000000000008f0 <start_secondary>:
> 8f0: 53 push %rbx
> 8f1: 48 83 ec 10 sub $0x10,%rsp
> 8f5: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
> 8fc: 00 00
> 8fe: 48 89 44 24 08 mov %rax,0x8(%rsp)
> 903: 31 c0 xor %eax,%eax
> ...
> a2e: e8 00 00 00 00 callq a33 <start_secondary+0x143>
> a2f: R_X86_64_PLT32 cpu_startup_entry-0x4
> a33: 48 8b 44 24 08 mov 0x8(%rsp),%rax
> a38: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
> a3f: 00 00
> a41: 75 12 jne a55 <start_secondary+0x165>
> a43: 48 83 c4 10 add $0x10,%rsp
> a47: 5b pop %rbx
> a48: c3 retq
> a49: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a50 <start_secondary+0x160>
> a4c: R_X86_64_PC32 debug_idt_descr-0x4
> a50: e9 cb fe ff ff jmpq 920 <start_secondary+0x30>
> a55: e8 00 00 00 00 callq a5a <start_secondary+0x16a>
> a56: R_X86_64_PLT32 __stack_chk_fail-0x4
Wait a sec, this function calls cpu_startup_entry as the last thing it
does, which should never return, and hence the stack canary value should
never get checked.
How does the canary get checked with the gcc10 code?
boot_init_stack_canary depends on working if called from functions that
don't return. If that doesn't work with gcc10, we need to disable stack
protector for the other callpoints too -- start_kernel in init/main.c
and cpu_bringup_and_idle in arch/x86/xen/smp_pv.c.
/*
* Initialize the stackprotector canary value.
*
* NOTE: this must only be called from functions that never return,
* and it must always be inlined.
*/
static __always_inline void boot_init_stack_canary(void)
On Mon, Mar 16, 2020 at 02:54:21PM -0400, Arvind Sankar wrote:
> On Mon, Mar 16, 2020 at 02:20:00PM -0400, Arvind Sankar wrote:
> > On Mon, Mar 16, 2020 at 06:54:50PM +0100, Borislav Petkov wrote:
> > > On Mon, Mar 16, 2020 at 02:42:34PM +0100, Peter Zijlstra wrote:
> > > > Right I know, I looked for it recently :/ But since this is new in 10
> > > > and 10 isn't released yet, I figured someone can add the attribute
> > > > before it does get released.
> > >
> > > Yes, that would be a good solution.
> > >
> > > I looked at what happens briefly after building gcc10 from git and IINM,
> > > the function in question - start_secondary() - already gets the stack
> > > canary asm glue added so it checks for a stack canary.
> > >
> > > However, the stack canary value itself gets set later in that same
> > > function:
> > >
> > > /* to prevent fake stack check failure in clock setup */
> > > boot_init_stack_canary();
> > >
> > > so the asm glue which checks for it would need to reload the newly
> > > computed canary value (it is 0 before we compute it and thus the
> > > mismatch).
> > >
> > > So having a way to state "do not add stack canary checking to this
> > > particular function" would be optimal. And since you already have the
> > > "stack_protect" function attribute I figure adding a "no_stack_protect"
> > > one should be easy...
> > >
> > > > > Or of course you could add noinline attribute to whatever got inlined
> > > > > and contains some array or addressable variable that whatever
> > > > > -fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
> > > > > it would never work even in the past I believe.
> > > >
> > > > I don't think the kernel supports -fstack-protector-all, but I could be
> > > > mistaken.
> > >
> > > The other thing I was thinking was to carve out only that function into
> > > a separate compilation unit and disable stack protector only for it.
> > >
> > > All IMHO of course.
> > >
> > > Thx.
> > >
> > > --
> > > Regards/Gruss,
> > > Boris.
> > >
> > > https://people.kernel.org/tglx/notes-about-netiquette
> >
> > With STACKPROTECTOR_STRONG, gcc9 (at least gentoo's version, not sure if
> > they have some patches that affect it) already adds stack canary into
> > start_secondary. Not sure why it doesn't panic already with gcc9?
> >
> > 00000000000008f0 <start_secondary>:
> > 8f0: 53 push %rbx
> > 8f1: 48 83 ec 10 sub $0x10,%rsp
> > 8f5: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
> > 8fc: 00 00
> > 8fe: 48 89 44 24 08 mov %rax,0x8(%rsp)
> > 903: 31 c0 xor %eax,%eax
> > ...
> > a2e: e8 00 00 00 00 callq a33 <start_secondary+0x143>
> > a2f: R_X86_64_PLT32 cpu_startup_entry-0x4
> > a33: 48 8b 44 24 08 mov 0x8(%rsp),%rax
> > a38: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
> > a3f: 00 00
> > a41: 75 12 jne a55 <start_secondary+0x165>
> > a43: 48 83 c4 10 add $0x10,%rsp
> > a47: 5b pop %rbx
> > a48: c3 retq
> > a49: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a50 <start_secondary+0x160>
> > a4c: R_X86_64_PC32 debug_idt_descr-0x4
> > a50: e9 cb fe ff ff jmpq 920 <start_secondary+0x30>
> > a55: e8 00 00 00 00 callq a5a <start_secondary+0x16a>
> > a56: R_X86_64_PLT32 __stack_chk_fail-0x4
>
> Wait a sec, this function calls cpu_startup_entry as the last thing it
> does, which should never return, and hence the stack canary value should
> never get checked.
>
> How does the canary get checked with the gcc10 code?
>
> boot_init_stack_canary depends on working if called from functions that
> don't return. If that doesn't work with gcc10, we need to disable stack
> protector for the other callpoints too -- start_kernel in init/main.c
> and cpu_bringup_and_idle in arch/x86/xen/smp_pv.c.
>
> /*
> * Initialize the stackprotector canary value.
> *
> * NOTE: this must only be called from functions that never return,
> * and it must always be inlined.
> */
> static __always_inline void boot_init_stack_canary(void)
Ugh, gcc10 tail-call optimizes the cpu_startup_entry call, and so checks
the canary before jumping out. The xen one will need to have stack
protector disabled too. It doesn't optimize the arch_call_rest_init call
in start_kernel for some reason, but we should probably disable it there
too.
a06: 0f ae f8 sfence
a09: 48 8b 44 24 08 mov 0x8(%rsp),%rax
a0e: 65 48 2b 04 25 28 00 sub %gs:0x28,%rax
a15: 00 00
a17: 75 1b jne a34 <start_secondary+0x164>
a19: 48 83 c4 10 add $0x10,%rsp
a1d: bf 8d 00 00 00 mov $0x8d,%edi
a22: 5b pop %rbx
a23: e9 00 00 00 00 jmpq a28 <start_secondary+0x158>
a24: R_X86_64_PLT32 cpu_startup_entry-0x4
a28: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a2f <start_secondary+0x15f>
a2b: R_X86_64_PC32 debug_idt_descr-0x4
a2f: e9 cc fe ff ff jmpq 900 <start_secondary+0x30>
a34: e8 00 00 00 00 callq a39 <start_secondary+0x169>
a35: R_X86_64_PLT32 __stack_chk_fail-0x4
On Mon, Mar 16, 2020 at 03:53:41PM -0400, Arvind Sankar wrote:
> > /*
> > * Initialize the stackprotector canary value.
> > *
> > * NOTE: this must only be called from functions that never return,
> > * and it must always be inlined.
> > */
> > static __always_inline void boot_init_stack_canary(void)
>
> Ugh, gcc10 tail-call optimizes the cpu_startup_entry call, and so checks
> the canary before jumping out. The xen one will need to have stack
> protector disabled too. It doesn't optimize the arch_call_rest_init call
> in start_kernel for some reason, but we should probably disable it there
> too.
If you mark cpu_startup_entry with __attribute__((noreturn)), then gcc won't
tail call it.
If you don't, you could add asm (""); after the call to avoid the tail call
too.
>
> a06: 0f ae f8 sfence
> a09: 48 8b 44 24 08 mov 0x8(%rsp),%rax
> a0e: 65 48 2b 04 25 28 00 sub %gs:0x28,%rax
> a15: 00 00
> a17: 75 1b jne a34 <start_secondary+0x164>
> a19: 48 83 c4 10 add $0x10,%rsp
> a1d: bf 8d 00 00 00 mov $0x8d,%edi
> a22: 5b pop %rbx
> a23: e9 00 00 00 00 jmpq a28 <start_secondary+0x158>
> a24: R_X86_64_PLT32 cpu_startup_entry-0x4
> a28: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a2f <start_secondary+0x15f>
> a2b: R_X86_64_PC32 debug_idt_descr-0x4
> a2f: e9 cc fe ff ff jmpq 900 <start_secondary+0x30>
> a34: e8 00 00 00 00 callq a39 <start_secondary+0x169>
> a35: R_X86_64_PLT32 __stack_chk_fail-0x4
Jakub
On Mon, Mar 16, 2020 at 09:08:55PM +0100, Jakub Jelinek wrote:
> On Mon, Mar 16, 2020 at 03:53:41PM -0400, Arvind Sankar wrote:
> > > /*
> > > * Initialize the stackprotector canary value.
> > > *
> > > * NOTE: this must only be called from functions that never return,
> > > * and it must always be inlined.
> > > */
> > > static __always_inline void boot_init_stack_canary(void)
> >
> > Ugh, gcc10 tail-call optimizes the cpu_startup_entry call, and so checks
> > the canary before jumping out. The xen one will need to have stack
> > protector disabled too. It doesn't optimize the arch_call_rest_init call
> > in start_kernel for some reason, but we should probably disable it there
> > too.
>
> If you mark cpu_startup_entry with __attribute__((noreturn)), then gcc won't
> tail call it.
So I've actually noticed this before -- what's the reason we don't
tail-call optimize calls to noreturn functions? The code remains a call
with nothing after it, but wouldn't a jump still be better? Or if it has
to be a call, at least an undefined opcode or int 3 after the call in
case it was erroneously annotated and returns anyway?
objtool apparently doesn't like what gcc does when calling noreturn
functions, complains about control falling through to next function.
Though it isn't good enough to detect that that's happening even in
start_secondary because there's some cold code following the call :)
init/main.o: warning: objtool: rest_init() falls through to next function kernel_init()
> If you don't, you could add asm (""); after the call to avoid the tail call
> too.
> >
> > a06: 0f ae f8 sfence
> > a09: 48 8b 44 24 08 mov 0x8(%rsp),%rax
> > a0e: 65 48 2b 04 25 28 00 sub %gs:0x28,%rax
> > a15: 00 00
> > a17: 75 1b jne a34 <start_secondary+0x164>
> > a19: 48 83 c4 10 add $0x10,%rsp
> > a1d: bf 8d 00 00 00 mov $0x8d,%edi
> > a22: 5b pop %rbx
> > a23: e9 00 00 00 00 jmpq a28 <start_secondary+0x158>
> > a24: R_X86_64_PLT32 cpu_startup_entry-0x4
> > a28: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a2f <start_secondary+0x15f>
> > a2b: R_X86_64_PC32 debug_idt_descr-0x4
> > a2f: e9 cc fe ff ff jmpq 900 <start_secondary+0x30>
> > a34: e8 00 00 00 00 callq a39 <start_secondary+0x169>
> > a35: R_X86_64_PLT32 __stack_chk_fail-0x4
>
> Jakub
>
On Mon, 16 Mar 2020 14:26:48 +0100
Jakub Jelinek <[email protected]> wrote:
> > > +# smpboot's init_secondary initializes stack canary.
> > > +# Make sure we don't emit stack checks before it's
> > > +# initialized.
> > > +nostackp := $(call cc-option, -fno-stack-protector)
> > > +CFLAGS_smpboot.o := $(nostackp)
> >
> > What makes GCC10 insert this while GCC9 does not. Also, I would much
>
> My bet is different inlining decisions.
> If somebody hands me over the preprocessed source + gcc command line, I can
> have a look in detail (which exact change and why).
In case you are still interested in preprocessed files and results I've collected
all the bits in a single tarball:
https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14.tar.gz
Same available in separate files in:
https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14/
Specifically:
- gcc-v.gcc-{9,10}: gcc-v output of both compilers. Note --enable-default-pie --enable-default-ssp.
- config.gcc-{9,10}: note, they are not identical as Kbuild does not recognize gcc-10's
plugin support. I don't use it though.
- boot-crash-gcc-10.jpg: picture of a full boot crash
- command.gcc-{9,10} called to generate .s files (it's almost the same when building .o files)
- arch-x86-kernel-smpboot.s-gcc-{9,10}: asm files, gennerated with 'make arch/x86/kernel/smpboot.s V=1'
- arch-x86-kernel-smpboot.c.c-gcc-{9,10}: preprocessed files, generated from command by changing -S to -E.
Another observation: kernel built by gcc-10 boots as-is in qemu without patches.
I wonder if the following boot line right before the crash has something to do wit it:
"random: get_random_bgtes called from start_secondary+0x105/0x1a0 with crng_init=0"
I hope it's not a race of async canary initialization and canary use.
Only one CPU is booted at that time, yes?
--
Sergei
On Mon, Mar 16, 2020 at 10:12:51PM +0000, Sergei Trofimovich wrote:
> In case you are still interested in preprocessed files and results I've collected
> all the bits in a single tarball:
> https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14.tar.gz
> Same available in separate files in:
> https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14/
Thanks, that is helpful.
So, a few comments.
One thing I've noticed in the command line is that
--param=allow-store-data-races=0
got dropped. That is fine, the parameter is gone, but it has been replaced
in https://gcc.gnu.org/PR92046 by the
-fno-allow-store-data-races
option. Like the param which defaulted to 0 and has been enabled only with
-Ofast this option is also -fno-allow-store-data-races by default unless
-Ofast, but if kernel wanted to be explicit or make sure not to introduce
them even with -Ofast, I'd say it should:
# Tell gcc to never replace conditional load with a non-conditional one
KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
+KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)
in the toplevel Makefile.
The actual change that causes cpu_startup_entry to be tail called in GCC 10
was my https://gcc.gnu.org/PR59813 https://gcc.gnu.org/PR89060
https://gcc.gnu.org/r10-230-gab87ac8d53f3d903bfc9eeb0f0b5e7eed1f38cbc
optimization. Previously, GCC punted just because it saw some earlier call
which passed address of some automatic variable to some other function and
escaped it that way (and could be possibly used during the function
considered for tail call, thus making the tail call not possible as with
tail call the frame is gone then). Now, GCC tries to use information about
scopes to determine that eventhough some automatic variables can escape that
way, if they aren't in the scope anymore during the last call, they
shouldn't be problematic. There are two variables that prevented the tail
call optimization in the past it seems, int cpuid; in smp_callin function
which is inlined, then its address escapes:
__dynamic_pr_debug(&__UNIQUE_ID_ddebug114, "smpboot" ": " "Stack at about %p\n", &cpuid);
and then cpuid goes out of scope.
Similarly, the u64 canary; variable in boot_init_stack_canary which is also
inlined. The address escapes in
get_random_bytes(&canary, sizeof(canary));
and later on is used and eventually goes out of scope.
Finally, there is the cpu_startup_entry call at the end of function.
Regarding the reason why GCC doesn't tailcall noreturn functions, that is a
deliberate decision, often that results in shorter code (there is no need to
restore stack etc. before the call and because it is noreturn, anything
after the call can be thrown away as unreachable) and more importantly,
results in more useful backtraces when something calls abort etc.
Hope this helps.
Jakub
On Mon, Mar 16, 2020 at 07:03:03PM +0100, Jakub Jelinek wrote:
> On Mon, Mar 16, 2020 at 06:54:50PM +0100, Borislav Petkov wrote:
> > So having a way to state "do not add stack canary checking to this
> > particular function" would be optimal. And since you already have the
> > "stack_protect" function attribute I figure adding a "no_stack_protect"
> > one should be easy...
>
> Easy, but a waste when GCC already has the optimize attribute that can
> handle also ~450 other options that are per-function rather than per-TU.
Ok, Micha explained to me what you mean here and I did:
static void __attribute__((optimize("no-stack-protect"))) notrace start_secondary(void *unused)
{
but it said
arch/x86/kernel/smpboot.c:216:1: warning: bad option ‘-fno-stack-protect’ to attribute ‘optimize’ [-Wattributes]
216 | {
| ^
because -fno-stack-protect is not implemented yet.
Regardless, yes, that can work too, if we had the -fno-stack-protect
variant.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Mar 17, 2020 at 03:36:02PM +0100, Borislav Petkov wrote:
> On Mon, Mar 16, 2020 at 07:03:03PM +0100, Jakub Jelinek wrote:
> > On Mon, Mar 16, 2020 at 06:54:50PM +0100, Borislav Petkov wrote:
> > > So having a way to state "do not add stack canary checking to this
> > > particular function" would be optimal. And since you already have the
> > > "stack_protect" function attribute I figure adding a "no_stack_protect"
> > > one should be easy...
> >
> > Easy, but a waste when GCC already has the optimize attribute that can
> > handle also ~450 other options that are per-function rather than per-TU.
>
> Ok, Micha explained to me what you mean here and I did:
>
> static void __attribute__((optimize("no-stack-protect"))) notrace start_secondary(void *unused)
> {
>
> but it said
That is because the option is called -fno-stack-protector, so one needs to
use __attribute__((optimize("no-stack-protector")))
Jakub
On Tue, Mar 17, 2020 at 03:39:14PM +0100, Jakub Jelinek wrote:
> That is because the option is called -fno-stack-protector, so one needs to
> use __attribute__((optimize("no-stack-protector")))
Ha, that works even! :-)
And my guest boots.
I guess we can do that then. Looks like the optimal solution to me.
Also, I'm assuming older gccs will simply ignore unknown optimize
options?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
From: Borislav Petkov
> Sent: 17 March 2020 14:50
> On Tue, Mar 17, 2020 at 03:39:14PM +0100, Jakub Jelinek wrote:
> > That is because the option is called -fno-stack-protector, so one needs to
> > use __attribute__((optimize("no-stack-protector")))
>
> Ha, that works even! :-)
>
> And my guest boots.
>
> I guess we can do that then. Looks like the optimal solution to me.
>
> Also, I'm assuming older gccs will simply ignore unknown optimize
> options?
Unlikely since it rejected the misspelt one.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Tue, 17 Mar 2020 12:46:05 +0100
Jakub Jelinek <[email protected]> wrote:
> So, a few comments.
>
> One thing I've noticed in the command line is that
> --param=allow-store-data-races=0
> got dropped. That is fine, the parameter is gone, but it has been replaced
> in https://gcc.gnu.org/PR92046 by the
> -fno-allow-store-data-races
> option. Like the param which defaulted to 0 and has been enabled only with
> -Ofast this option is also -fno-allow-store-data-races by default unless
> -Ofast, but if kernel wanted to be explicit or make sure not to introduce
> them even with -Ofast, I'd say it should:
> # Tell gcc to never replace conditional load with a non-conditional one
> KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
> +KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)
> in the toplevel Makefile.
Yeah, I noticed it yesterday as well and sent out exactly the same change:
https://lkml.org/lkml/2020/3/16/1012
I also checked that flag does not change code generation on -O2 for smpboot.c
--
Sergei
On Tue, Mar 17, 2020 at 03:49:42PM +0100, Borislav Petkov wrote:
> On Tue, Mar 17, 2020 at 03:39:14PM +0100, Jakub Jelinek wrote:
> > That is because the option is called -fno-stack-protector, so one needs to
> > use __attribute__((optimize("no-stack-protector")))
>
> Ha, that works even! :-)
>
> And my guest boots.
>
> I guess we can do that then. Looks like the optimal solution to me.
>
> Also, I'm assuming older gccs will simply ignore unknown optimize
> options?
Sergei,
wanna send v2 where you add the function attribute to start_secondary()
only instead?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, 25 Mar 2020 14:31:37 +0100
Borislav Petkov <[email protected]> wrote:
> On Tue, Mar 17, 2020 at 03:49:42PM +0100, Borislav Petkov wrote:
> > On Tue, Mar 17, 2020 at 03:39:14PM +0100, Jakub Jelinek wrote:
> > > That is because the option is called -fno-stack-protector, so one needs to
> > > use __attribute__((optimize("no-stack-protector")))
> >
> > Ha, that works even! :-)
> >
> > And my guest boots.
> >
> > I guess we can do that then. Looks like the optimal solution to me.
> >
> > Also, I'm assuming older gccs will simply ignore unknown optimize
> > options?
>
> Sergei,
>
> wanna send v2 where you add the function attribute to start_secondary()
> only instead?
Will attempt to write, test and send out the patch by tomorrow.
--
Sergei
On Thu, Mar 26, 2020 at 09:54:24PM +0000, Sergei Trofimovich wrote:
> Will attempt to write, test and send out the patch by tomorrow.
No hurry - we'll have merge window opening next Monday so I'll queue
yours after ~2 weeks. So later's fine too.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The change fixes boot failure on physical machine where kernel
is built with gcc-10 with stack protector enabled by default:
```
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary+0x191/0x1a0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
Call Trace:
dump_stack+0x71/0xa0
panic+0x107/0x2b8
? start_secondary+0x191/0x1a0
__stack_chk_fail+0x15/0x20
start_secondary+0x191/0x1a0
secondary_startup_64+0xa4/0xb0
-—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary+0x191
```
This happens because `start_secondary()` is responsible for setting
up initial stack canary value in `smpboot.c`, but nothing prevents
gcc from inserting stack canary into `start_secondary()` itself
before `boot_init_stack_canary()` call.
The fix inhibits stack canary check foa single `start_secondary()`
function.
Tested the change by successfully booting the machine.
A few similar crashes on VMs:
- https://bugzilla.redhat.com/show_bug.cgi?id=1796780
- http://rglinuxtech.com/?p=2694
CC: Jakub Jelinek <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Andy Lutomirski <[email protected]>
CC: Peter Zijlstra <[email protected]>
CC: Michael Matz <[email protected]>
CC: [email protected]
Signed-off-by: Sergei Trofimovich <[email protected]>
---
arch/x86/kernel/smpboot.c | 5 ++++-
include/linux/compiler-gcc.h | 1 +
include/linux/compiler_types.h | 4 ++++
3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 69881b2d446c..99a4cb631a64 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -207,8 +207,11 @@ static int cpu0_logical_apicid;
static int enable_start_cpu0;
/*
* Activate a secondary processor.
+ *
+ * Note: 'boot_init_stack_canary' changes canary value. Omit
+ * stack protection to avoid canary check (and boot) failure.
*/
-static void notrace start_secondary(void *unused)
+static void __no_stack_protector notrace start_secondary(void *unused)
{
/*
* Don't put *anything* except direct CPU state initialization
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index d7ee4c6bad48..fb67c743138c 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -172,3 +172,4 @@
#endif
#define __no_fgcse __attribute__((optimize("-fno-gcse")))
+#define __no_stack_protector __attribute__((optimize("-fno-stack-protector")))
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 72393a8c1a6c..9d5de1ea0b03 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -212,6 +212,10 @@ struct ftrace_likely_data {
#define asm_inline asm
#endif
+#ifndef __no_stack_protector
+# define __no_stack_protector
+#endif
+
#ifndef __no_fgcse
# define __no_fgcse
#endif
--
2.26.0
The change fixes boot failure on physical machine where kernel
is built with gcc-10 with stack protector enabled by default:
```
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary+0x191/0x1a0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
Call Trace:
dump_stack+0x71/0xa0
panic+0x107/0x2b8
? start_secondary+0x191/0x1a0
__stack_chk_fail+0x15/0x20
start_secondary+0x191/0x1a0
secondary_startup_64+0xa4/0xb0
-—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary+0x191
```
This happens because `start_secondary()` is responsible for setting
up initial stack canary value in `smpboot.c`, but nothing prevents
gcc from inserting stack canary into `start_secondary()` itself
before `boot_init_stack_canary()` call.
The fix inhibits stack canary check foa single `start_secondary()`
function.
Tested the change by successfully booting the machine.
A few similar crashes on VMs:
- https://bugzilla.redhat.com/show_bug.cgi?id=1796780
- http://rglinuxtech.com/?p=2694
CC: Jakub Jelinek <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Borislav Petkov <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Andy Lutomirski <[email protected]>
CC: Peter Zijlstra <[email protected]>
CC: Michael Matz <[email protected]>
CC: [email protected]
Signed-off-by: Sergei Trofimovich <[email protected]>
---
arch/x86/kernel/smpboot.c | 5 ++++-
include/linux/compiler-gcc.h | 1 +
include/linux/compiler_types.h | 4 ++++
3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 69881b2d446c..99a4cb631a64 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -207,8 +207,11 @@ static int cpu0_logical_apicid;
static int enable_start_cpu0;
/*
* Activate a secondary processor.
+ *
+ * Note: 'boot_init_stack_canary' changes canary value. Omit
+ * stack protection to avoid canary check (and boot) failure.
*/
-static void notrace start_secondary(void *unused)
+static void __no_stack_protector notrace start_secondary(void *unused)
{
/*
* Don't put *anything* except direct CPU state initialization
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index d7ee4c6bad48..fb67c743138c 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -172,3 +172,4 @@
#endif
#define __no_fgcse __attribute__((optimize("-fno-gcse")))
+#define __no_stack_protector __attribute__((optimize("-fno-stack-protector")))
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 72393a8c1a6c..9d5de1ea0b03 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -212,6 +212,10 @@ struct ftrace_likely_data {
#define asm_inline asm
#endif
+#ifndef __no_stack_protector
+# define __no_stack_protector
+#endif
+
#ifndef __no_fgcse
# define __no_fgcse
#endif
--
2.26.0
On Sat, Mar 28, 2020 at 08:48:58AM +0000, Sergei Trofimovich wrote:
> @@ -207,8 +207,11 @@ static int cpu0_logical_apicid;
> static int enable_start_cpu0;
> /*
> * Activate a secondary processor.
> + *
> + * Note: 'boot_init_stack_canary' changes canary value. Omit
> + * stack protection to avoid canary check (and boot) failure.
> */
> -static void notrace start_secondary(void *unused)
> +static void __no_stack_protector notrace start_secondary(void *unused)
Hmm, so we did this per-function marking only but that explodes on
32-bit, see splat at the end. gcc guys, any ideas?
The null pointer deref happens this way:
The __no_stack_protector annotated function start_secondary() calls
trace_hardirqs_on(). On entry, that function pushes the frame pointer on
the stack:
trace_hardirqs_on:
pushl %ebp #
movl %esp, %ebp #,
subl $20, %esp #,
movl %ebx, -12(%ebp) #,
movl %esi, -8(%ebp) #,
movl %edi, -4(%ebp) #,
Singlestepping the whole thing in gdb looks like this:
Dump of assembler code from 0xc1158610 to 0xc1158624:
=> 0xc1158610 <trace_hardirqs_on+0>: 55 push %ebp <---
0xc1158611 <trace_hardirqs_on+1>: 89 e5 mov %esp,%ebp
and ebp has:
...
ebp 0x0 0x0 <---
esi 0x200002 2097154
edi 0x1 1
eip 0xc1158610
...
Later in the function, it will do __builtin_return_address(n), which
turns into:
# kernel/trace/trace_preemptirq.c:26: trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
movl 0(%ebp), %eax #, tmp133
<--- it loads the previously pushed 0 on the stack into %eax
# kernel/trace/trace_preemptirq.c:27: tracer_hardirqs_on(CALLER_ADDR0, CALLER_ADDR1);
movl 4(%eax), %edx #, tmp130
<--- derefs it here. Boom.
So, could it be that marking this one function like this:
static void __attribute__((optimize("-fno-stack-protector"))) __attribute__((no_instrument_function)) start_secondary(void *unused)
{
would cause %ebp to be 0 for whatever reason on 32-bit?
Interestingly enough, if I use the first variant we had where we built
the whole compilation unit with -fno-stack-protector, the issue is gone
and %ebp has the correct value:
ebp 0xf1163fac 0xf1163fac
esi 0x200002 2097154
edi 0x1 1
eip 0xc11585c0 0xc11585c0 <trace_hardirqs_on>
eflags 0x200086 [ PF SF ID ]
cs 0x60 96
ss 0x68 104
ds 0x7b 123
es 0x7b 123
fs 0xd8 216
gs 0xe0 224
=> 0xc11585c0 <trace_hardirqs_on>: push %ebp
0xf1163f84: 0x00000000c104b016 0x0000000000000000
0xf1163f94: 0x0000000000000001 0x0000000000000002
0xf1163fa4: 0x0000000001000800 0xc10001e400000000
0xf1163fb4: 0x0000000000000000 0x0000000000000000
0xf1163fc4: 0x0000000000000000 0x0000000000000000
Dump of assembler code from 0xc11585c0 to 0xc11585d4:
=> 0xc11585c0 <trace_hardirqs_on+0>: 55 push %ebp
0xc11585c1 <trace_hardirqs_on+1>: 89 e5 mov %esp,%ebp
Any ideas whether 32-bit behaves like this here?
Thx.
[ 0.269147] smpboot: CPU 1 Converting physical 0 to logical die 1
[ 0.269147] BUG: kernel NULL pointer dereference, address: 00000004
[ 0.269147] #PF: supervisor read access in kernel mode
[ 0.269147] #PF: error_code(0x0000) - not-present page
[ 0.269147] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[ 0.269147] Oops: 0000 [#1] PREEMPT SMP
[ 0.269147] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.7.0-rc1+ #3
[ 0.269147] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
[ 0.269147] EIP: trace_hardirqs_on+0x5e/0x110
[ 0.269147] Code: 00 00 64 c7 05 f8 20 c2 c1 00 00 00 00 8b 45 04 e8 e7 3b f7 ff 8b 5d f4 8b 75 f8 8b 7d fc c9 c3 8d 74 26 00 8b 15 00 b4 b3 c1 <8b> 48 04 8b 5d 04 85 d2 7e c4 64 a1 d4 a2 c0 c1 0f a3 05 1c 89 b4
[ 0.269147] EAX: 00000000 EBX: f1163f98 ECX: 00000000 EDX: 00000000
[ 0.269147] ESI: 00200002 EDI: 00000001 EBP: f1163f84 ESP: f1163f70
[ 0.269147] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210046
[ 0.269147] CR0: 80050033 CR2: 00000004 CR3: 01c2e000 CR4: 003406f0
[ 0.269147] Call Trace:
[ 0.269147] ? _raw_spin_unlock+0x27/0x50
[ 0.269147] start_secondary+0x159/0x220
[ 0.269147] ? startup_32_smp+0x164/0x168
[ 0.269147] Modules linked in:
[ 0.269147] CR2: 0000000000000004
[ 0.269147] ---[ end trace e721c1dd98762fde ]---
[ 0.269147] EIP: trace_hardirqs_on+0x5e/0x110
[ 0.269147] Code: 00 00 64 c7 05 f8 20 c2 c1 00 00 00 00 8b 45 04 e8 e7 3b f7 ff 8b 5d f4 8b 75 f8 8b 7d fc c9 c3 8d 74 26 00 8b 15 00 b4 b3 c1 <8b> 48 04 8b 5d 04 85 d2 7e c4 64 a1 d4 a2 c0 c1 0f a3 05 1c 89 b4
[ 0.269147] EAX: 00000000 EBX: f1163f98 ECX: 00000000 EDX: 00000000
[ 0.269147] ESI: 00200002 EDI: 00000001 EBP: f1163f84 ESP: f1163f70
[ 0.269147] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210046
[ 0.269147] CR0: 80050033 CR2: 00000004 CR3: 01c2e000 CR4: 003406f0
[ 0.269147] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.269147] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 5871c72d659e5c312b9ad635034cab59f7786a98
Gitweb: https://git.kernel.org/tip/5871c72d659e5c312b9ad635034cab59f7786a98
Author: Sergei Trofimovich <[email protected]>
AuthorDate: Sat, 28 Mar 2020 08:48:58
Committer: Borislav Petkov <[email protected]>
CommitterDate: Mon, 13 Apr 2020 16:07:35 +02:00
x86: Fix early boot crash on gcc-10
Fix a boot failure where the kernel is built with gcc-10 with stack
protector enabled by default:
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
Call Trace:
dump_stack
panic
? start_secondary
__stack_chk_fail
start_secondary
secondary_startup_64
-—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
This happens because start_secondary() is responsible for setting
up initial stack canary value in smpboot.c but nothing prevents gcc
from inserting stack canary into start_secondary() itself before the
boot_init_stack_canary() call which sets up said canary value.
Inhibit the stack canary addition for start_secondary() only.
[ bp: Massage a bit. ]
Signed-off-by: Sergei Trofimovich <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Jakub Jelinek <[email protected]>
Cc: Michael Matz <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/smpboot.c | 6 +++++-
include/linux/compiler-gcc.h | 1 +
include/linux/compiler_types.h | 4 ++++
3 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index fe3ab96..9ea28e5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -209,10 +209,14 @@ static void smp_callin(void)
static int cpu0_logical_apicid;
static int enable_start_cpu0;
+
/*
* Activate a secondary processor.
+ *
+ * Note: boot_init_stack_canary() sets up the canary value so omit the stack
+ * canary creation for this function only.
*/
-static void notrace start_secondary(void *unused)
+static void __no_stack_protector notrace start_secondary(void *unused)
{
/*
* Don't put *anything* except direct CPU state initialization
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index d7ee4c6..fb67c74 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -172,3 +172,4 @@
#endif
#define __no_fgcse __attribute__((optimize("-fno-gcse")))
+#define __no_stack_protector __attribute__((optimize("-fno-stack-protector")))
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index e970f97..069c981 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -203,6 +203,10 @@ struct ftrace_likely_data {
#define asm_inline asm
#endif
+#ifndef __no_stack_protector
+# define __no_stack_protector
+#endif
+
#ifndef __no_fgcse
# define __no_fgcse
#endif
Hello,
On Mon, 13 Apr 2020, Borislav Petkov wrote:
> > @@ -207,8 +207,11 @@ static int cpu0_logical_apicid;
> > static int enable_start_cpu0;
> > /*
> > * Activate a secondary processor.
> > + *
> > + * Note: 'boot_init_stack_canary' changes canary value. Omit
> > + * stack protection to avoid canary check (and boot) failure.
> > */
> > -static void notrace start_secondary(void *unused)
> > +static void __no_stack_protector notrace start_secondary(void *unused)
>
> Hmm, so we did this per-function marking only but that explodes on
> 32-bit, see splat at the end. gcc guys, any ideas?
>
> The null pointer deref happens this way:
>
> The __no_stack_protector annotated function start_secondary() calls
> trace_hardirqs_on(). On entry, that function pushes the frame pointer on
> the stack:
>
> trace_hardirqs_on:
> pushl %ebp #
> movl %esp, %ebp #,
> subl $20, %esp #,
> movl %ebx, -12(%ebp) #,
> movl %esi, -8(%ebp) #,
> movl %edi, -4(%ebp) #,
>
>
> Singlestepping the whole thing in gdb looks like this:
>
> Dump of assembler code from 0xc1158610 to 0xc1158624:
> => 0xc1158610 <trace_hardirqs_on+0>: 55 push %ebp <---
> 0xc1158611 <trace_hardirqs_on+1>: 89 e5 mov %esp,%ebp
>
> and ebp has:
>
> ...
> ebp 0x0 0x0 <---
> esi 0x200002 2097154
> edi 0x1 1
> eip 0xc1158610
> ...
>
> Later in the function, it will do __builtin_return_address(n), which
> turns into:
>
> # kernel/trace/trace_preemptirq.c:26: trace_irq_enable_rcuidle(CALLER_ADDR0, CALLER_ADDR1);
> movl 0(%ebp), %eax #, tmp133
> # kernel/trace/trace_preemptirq.c:27: tracer_hardirqs_on(CALLER_ADDR0, CALLER_ADDR1);
> movl 4(%eax), %edx #, tmp130
So this part expects that the caller (!) of trace_hardirqs_on was compiled
with a frame pointer (in %ebp). Obviously that's not the case as you
traced above. Is start_secondary the immediate caller in the above
case?
> <--- derefs it here. Boom.
>
> So, could it be that marking this one function like this:
>
> static void __attribute__((optimize("-fno-stack-protector"))) __attribute__((no_instrument_function)) start_secondary(void *unused)
> {
>
> would cause %ebp to be 0 for whatever reason on 32-bit?
Look at it's disassembly. If it doesn't have the usual push
%ebp/mov%esp,%ebp prologue it probably doesn't use a frame pointer. In
that case I would speculate that having a frame pointer was (before the
change above) only a side-effect of being compiled with -fstack-protector,
which got now disabled. But I was under the impression that the upstream
kernels build with -fno-omit-frame-pointer, so that sounds unexpected.
But I have no better explanation at the moment. If the above speculation
doesn't make you progress: preprocessed file and a note of what the
immediate caller of trace_hardirqs_on is in the above case, please.
Ciao,
Michael.
On Tue, Apr 14, 2020 at 01:50:29PM +0000, Michael Matz wrote:
> So this part expects that the caller (!) of trace_hardirqs_on was compiled
> with a frame pointer (in %ebp).
/me looks at the .s file...
options passed comment at the top has -fno-omit-frame-pointer
> Obviously that's not the case as you traced above. Is start_secondary
> the immediate caller in the above case?
Yes, start_secondary() is the function which is marked as
__attribute__((optimize("-fno-stack-protector"))) and it does:
# arch/x86/kernel/smpboot.c:264: local_irq_enable();
call trace_hardirqs_on #
(the local_irq_enable() is a macro which has the call to
trace_hardirqs_on().
> Look at it's disassembly. If it doesn't have the usual push
> %ebp/mov%esp,%ebp prologue it probably doesn't use a frame pointer.
Here's the preamble:
.text
.p2align 4
.type start_secondary, @function
start_secondary:
pushl %esi #
pushl %ebx #
subl $28, %esp #,
# arch/x86/kernel/smpboot.c:226: cr4_init();
call cr4_init #
...
Those two pushes on entry are for callee-saved regs, AFAICT, because it
is going to use them later. But no frame setup.
> In that case I would speculate that having a frame pointer
> was (before the change above) only a side-effect of being
> compiled with -fstack-protector, which got now disabled.
Looks like it. And that's 32-bit. 64-bit variant of this looks more like it:
.text
.p2align 4
.type start_secondary, @function
start_secondary:
pushq %rbp #
subq $8, %rsp #,
although it doesn't save %rsp on the stack. I think it leaves space on
the stack for the canary but I'm not sure.
> But I was under the impression that the upstream kernels build with
> -fno-omit-frame-pointer, so that sounds unexpected.
Yap, see above.
> But I have no better explanation at the moment. If the above
> speculation doesn't make you progress: preprocessed file and a note of
> what the immediate caller of trace_hardirqs_on is in the above case,
> please.
Ok, I'll find you on IRC to talk details.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Hello,
On Wed, 15 Apr 2020, Borislav Petkov wrote:
> On Tue, Apr 14, 2020 at 01:50:29PM +0000, Michael Matz wrote:
> > So this part expects that the caller (!) of trace_hardirqs_on was compiled
> > with a frame pointer (in %ebp).
>
> /me looks at the .s file...
>
> options passed comment at the top has -fno-omit-frame-pointer
>
> > Obviously that's not the case as you traced above. Is start_secondary
> > the immediate caller in the above case?
>
> Yes, start_secondary() is the function which is marked as
> __attribute__((optimize("-fno-stack-protector"))) and it does:
>
> # arch/x86/kernel/smpboot.c:264: local_irq_enable();
> call trace_hardirqs_on #
>
> (the local_irq_enable() is a macro which has the call to
> trace_hardirqs_on().
>
> > Look at it's disassembly. If it doesn't have the usual push
> > %ebp/mov%esp,%ebp prologue it probably doesn't use a frame pointer.
>
> Here's the preamble:
>
> .text
> .p2align 4
> .type start_secondary, @function
> start_secondary:
> pushl %esi #
> pushl %ebx #
Right. So meanwhile it became clear: the optimize function attribute
doesn't work cumulative but rather replaces all cmdline args (the
optimization ones, but that roughly translates to -fxxx options). In this
case an 'optimize("-fno-stack-protector")' also disables the crucial
-fno-omit-frame-pointer, reverting to the compilers default, which,
depending on version, is also to omit the frame pointer on 32bit. You
could fix that by adding ',-fno-omit-frame-pointer' to the attribute
string. But that quickly gets out of hand, considering all the options
you carefully need to set in Makefiles to get the right behaviour. (Note
that e.g. the optimization level is reset to -O0 as well!).
(I'll admit that I was somewhat surprised by this behaviour, even though
it makes sense in the abstract; resetting to a clean slate and
everything).
I think in its current form the optimize attribute is not useful for the
purposes you need, and you're better off to disable the stack protector
for the whole compilation unit from the Makefile.
(That attribute is also documented as "not suitable in production code",
so go figure ;-) )
I think it will be possible to make that attribute a bit more useful
in the future, but for the time being I think you'll just want to live
without it.
Ciao,
Michael.
On Wed, 15 Apr 2020 14:53:45 +0000 (UTC)
Michael Matz <[email protected]> wrote:
> Hello,
>
> On Wed, 15 Apr 2020, Borislav Petkov wrote:
>
> > On Tue, Apr 14, 2020 at 01:50:29PM +0000, Michael Matz wrote:
> > > So this part expects that the caller (!) of trace_hardirqs_on was compiled
> > > with a frame pointer (in %ebp).
> >
> > /me looks at the .s file...
> >
> > options passed comment at the top has -fno-omit-frame-pointer
> >
> > > Obviously that's not the case as you traced above. Is start_secondary
> > > the immediate caller in the above case?
> >
> > Yes, start_secondary() is the function which is marked as
> > __attribute__((optimize("-fno-stack-protector"))) and it does:
> >
> > # arch/x86/kernel/smpboot.c:264: local_irq_enable();
> > call trace_hardirqs_on #
> >
> > (the local_irq_enable() is a macro which has the call to
> > trace_hardirqs_on().
> >
> > > Look at it's disassembly. If it doesn't have the usual push
> > > %ebp/mov%esp,%ebp prologue it probably doesn't use a frame pointer.
> >
> > Here's the preamble:
> >
> > .text
> > .p2align 4
> > .type start_secondary, @function
> > start_secondary:
> > pushl %esi #
> > pushl %ebx #
>
> Right. So meanwhile it became clear: the optimize function attribute
> doesn't work cumulative but rather replaces all cmdline args (the
> optimization ones, but that roughly translates to -fxxx options). In this
> case an 'optimize("-fno-stack-protector")' also disables the crucial
> -fno-omit-frame-pointer, reverting to the compilers default, which,
> depending on version, is also to omit the frame pointer on 32bit. You
> could fix that by adding ',-fno-omit-frame-pointer' to the attribute
> string. But that quickly gets out of hand, considering all the options
> you carefully need to set in Makefiles to get the right behaviour. (Note
> that e.g. the optimization level is reset to -O0 as well!).
>
> (I'll admit that I was somewhat surprised by this behaviour, even though
> it makes sense in the abstract; resetting to a clean slate and
> everything).
>
> I think in its current form the optimize attribute is not useful for the
> purposes you need, and you're better off to disable the stack protector
> for the whole compilation unit from the Makefile.
>
> (That attribute is also documented as "not suitable in production code",
> so go figure ;-) )
>
> I think it will be possible to make that attribute a bit more useful
> in the future, but for the time being I think you'll just want to live
> without it.
Ah, that makes sense. Borislav, should I send a fix forward against
x86 tree to move -fno-stack-protector as it was in v1 patch?
Or you'll revert v2 and apply v1 ~as is? Or should I send those myself?
--
Sergei
On Wed, Apr 15, 2020 at 11:19:30PM +0100, Sergei Trofimovich wrote:
> Ah, that makes sense. Borislav, should I send a fix forward against
> x86 tree to move -fno-stack-protector as it was in v1 patch?
> Or you'll revert v2 and apply v1 ~as is? Or should I send those myself?
Yeah, Peter and I have been discussing something like the below
yesterday. I don't like the additional exports too much but would
disable stack protector only for the one function...
---
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 3bcf27caf6c9..e258a6a21674 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -990,4 +990,8 @@ enum mds_mitigations {
MDS_MITIGATION_VMWERV,
};
+extern int enable_start_cpu0;
+void smp_callin(void);
+void notrace start_secondary(void *unused);
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 92e1261ec4ec..7130ca9edc50 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -87,7 +87,13 @@ obj-$(CONFIG_PCI) += early-quirks.o
apm-y := apm_32.o
obj-$(CONFIG_APM) += apm.o
obj-$(CONFIG_SMP) += smp.o
-obj-$(CONFIG_SMP) += smpboot.o
+
+nostackprot := $(call cc-option, -fno-stack-protector)
+CFLAGS_smpboot_aux.o := $(nostackprot)
+
+smpboot_all-y := smpboot.o smpboot_aux.o
+obj-$(CONFIG_SMP) += smpboot_all.o
+
obj-$(CONFIG_X86_TSC) += tsc_sync.o
obj-$(CONFIG_SMP) += setup_percpu.o
obj-$(CONFIG_X86_MPPARSE) += mpparse.o
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3b9bf8c7e29d..1ce6280999f9 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -153,7 +153,7 @@ static void init_freq_invariance(void);
* Report back to the Boot Processor during boot time or to the caller processor
* during CPU online.
*/
-static void smp_callin(void)
+void smp_callin(void)
{
int cpuid;
@@ -208,65 +208,7 @@ static void smp_callin(void)
}
static int cpu0_logical_apicid;
-static int enable_start_cpu0;
-/*
- * Activate a secondary processor.
- */
-static void notrace start_secondary(void *unused)
-{
- /*
- * Don't put *anything* except direct CPU state initialization
- * before cpu_init(), SMP booting is too fragile that we want to
- * limit the things done here to the most necessary things.
- */
- cr4_init();
-
-#ifdef CONFIG_X86_32
- /* switch away from the initial page table */
- load_cr3(swapper_pg_dir);
- __flush_tlb_all();
-#endif
- load_current_idt();
- cpu_init();
- x86_cpuinit.early_percpu_clock_init();
- preempt_disable();
- smp_callin();
-
- enable_start_cpu0 = 0;
-
- /* otherwise gcc will move up smp_processor_id before the cpu_init */
- barrier();
- /*
- * Check TSC synchronization with the boot CPU:
- */
- check_tsc_sync_target();
-
- speculative_store_bypass_ht_init();
-
- /*
- * Lock vector_lock, set CPU online and bring the vector
- * allocator online. Online must be set with vector_lock held
- * to prevent a concurrent irq setup/teardown from seeing a
- * half valid vector space.
- */
- lock_vector_lock();
- set_cpu_online(smp_processor_id(), true);
- lapic_online();
- unlock_vector_lock();
- cpu_set_state_online(smp_processor_id());
- x86_platform.nmi_init();
-
- /* enable local interrupts */
- local_irq_enable();
-
- /* to prevent fake stack check failure in clock setup */
- boot_init_stack_canary();
-
- x86_cpuinit.setup_percpu_clockev();
-
- wmb();
- cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
-}
+int enable_start_cpu0;
/**
* topology_is_primary_thread - Check whether CPU is the primary SMT thread
diff --git a/arch/x86/kernel/smpboot_aux.c b/arch/x86/kernel/smpboot_aux.c
new file mode 100644
index 000000000000..8863fde54eed
--- /dev/null
+++ b/arch/x86/kernel/smpboot_aux.c
@@ -0,0 +1,70 @@
+#include <linux/cpu.h>
+
+#include <asm/desc.h>
+#include <asm/hw_irq.h>
+#include <asm/spec-ctrl.h>
+#include <asm/processor.h>
+#include <asm/stackprotector.h>
+
+/*
+ * Activate a secondary processor.
+ *
+ * Note: boot_init_stack_canary() sets up the canary value so omit the stack
+ * canary creation for this function only by keeping it in a separate
+ * compilation unit.
+ */
+void notrace start_secondary(void *unused)
+{
+ /*
+ * Don't put *anything* except direct CPU state initialization
+ * before cpu_init(), SMP booting is too fragile that we want to
+ * limit the things done here to the most necessary things.
+ */
+ cr4_init();
+
+#ifdef CONFIG_X86_32
+ /* switch away from the initial page table */
+ load_cr3(swapper_pg_dir);
+ __flush_tlb_all();
+#endif
+ load_current_idt();
+ cpu_init();
+ x86_cpuinit.early_percpu_clock_init();
+ preempt_disable();
+ smp_callin();
+
+ enable_start_cpu0 = 0;
+
+ /* otherwise gcc will move up smp_processor_id before the cpu_init */
+ barrier();
+ /*
+ * Check TSC synchronization with the boot CPU:
+ */
+ check_tsc_sync_target();
+
+ speculative_store_bypass_ht_init();
+
+ /*
+ * Lock vector_lock, set CPU online and bring the vector
+ * allocator online. Online must be set with vector_lock held
+ * to prevent a concurrent irq setup/teardown from seeing a
+ * half valid vector space.
+ */
+ lock_vector_lock();
+ set_cpu_online(smp_processor_id(), true);
+ lapic_online();
+ unlock_vector_lock();
+ cpu_set_state_online(smp_processor_id());
+ x86_platform.nmi_init();
+
+ /* enable local interrupts */
+ local_irq_enable();
+
+ /* to prevent fake stack check failure in clock setup */
+ boot_init_stack_canary();
+
+ x86_cpuinit.setup_percpu_clockev();
+
+ wmb();
+ cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+}
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Apr 17, 2020 at 09:57:39AM +0200, Borislav Petkov wrote:
> On Wed, Apr 15, 2020 at 11:19:30PM +0100, Sergei Trofimovich wrote:
> > Ah, that makes sense. Borislav, should I send a fix forward against
> > x86 tree to move -fno-stack-protector as it was in v1 patch?
> > Or you'll revert v2 and apply v1 ~as is? Or should I send those myself?
>
> Yeah, Peter and I have been discussing something like the below
> yesterday. I don't like the additional exports too much but would
> disable stack protector only for the one function...
If you want minimal changes, you can as I said earlier either
mark cpu_startup_entry noreturn (in the declaration in some header so that
smpboot.c sees it), or you could add something after the cpu_startup_entry
call to ensure it is not tail call optimized (e.g. just
/* Prevent tail call to cpu_startup_entry because the stack
protector guard has been changed in the middle of this function
and must not be checked before tail calling another function. */
asm ("");
would do, or for (;;); , or combine both, mark cpu_startup_entry noreturn and
add asm (""); (which most GCC versions will optimize away as unreachable).
Jakub
On Fri, Apr 17, 2020 at 10:07:26AM +0200, Jakub Jelinek wrote:
> If you want minimal changes, you can as I said earlier either
> mark cpu_startup_entry noreturn (in the declaration in some header so that
> smpboot.c sees it), or you could add something after the cpu_startup_entry
> call to ensure it is not tail call optimized (e.g. just
> /* Prevent tail call to cpu_startup_entry because the stack
> protector guard has been changed in the middle of this function
> and must not be checked before tail calling another function. */
> asm ("");
That sounds ok-ish to me too.
I know you probably can't tell the future :) but what stops gcc from
doing the tail-call optimization in the future?
Or are optimization decisions behind an inline asm a no-no and will
pretty much always stay that way?
And I hope the clang folks don't come around and say, err, nope, we're
much more aggressive here.
However, if we do it with the explicit disabling with
-fno-stack-protector for only this compilation unit, then it is
1. clear why we're doing this
2. no compiler would break it
So I'm still gravitating a bit towards the explicit thing...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Apr 17, 2020 at 10:42:24AM +0200, Borislav Petkov wrote:
> On Fri, Apr 17, 2020 at 10:07:26AM +0200, Jakub Jelinek wrote:
> > If you want minimal changes, you can as I said earlier either
> > mark cpu_startup_entry noreturn (in the declaration in some header so that
> > smpboot.c sees it), or you could add something after the cpu_startup_entry
> > call to ensure it is not tail call optimized (e.g. just
> > /* Prevent tail call to cpu_startup_entry because the stack
> > protector guard has been changed in the middle of this function
> > and must not be checked before tail calling another function. */
> > asm ("");
>
> That sounds ok-ish to me too.
>
> I know you probably can't tell the future :) but what stops gcc from
> doing the tail-call optimization in the future?
>
> Or are optimization decisions behind an inline asm a no-no and will
> pretty much always stay that way?
GCC intentionally treats asm as a black box, the only thing which it does
with it is: non-volatile asm (but asm without outputs is implicitly
volatile) can be CSEd, and if the compiler needs to estimate size, it
uses some heuristics by counting ; and newlines.
And it will stay this way.
> And I hope the clang folks don't come around and say, err, nope, we're
> much more aggressive here.
Unlike GCC, I think clang uses the builtin assembler to parse the string,
but don't know if it still treats the asms more like black boxes or not.
Certainly there is a lot of code in the wild that uses inline asm
as optimization barriers, so if it doesn't, then it would cause a lot of
problems.
Or go with the for (;;);, I don't think any compiler optimizes those away;
GCC 10 for C++ can optimize away infinite loops that have some conditional
exit because the language guarantees forward progress, but the C language
rules are different and for unconditional infinite loops GCC doesn't
optimize them away even if explicitly asked to -ffinite-loops.
Jakub
On Fri, Apr 17, 2020 at 10:58:59AM +0200, Jakub Jelinek wrote:
> On Fri, Apr 17, 2020 at 10:42:24AM +0200, Borislav Petkov wrote:
> > On Fri, Apr 17, 2020 at 10:07:26AM +0200, Jakub Jelinek wrote:
> > > If you want minimal changes, you can as I said earlier either
> > > mark cpu_startup_entry noreturn (in the declaration in some header so that
> > > smpboot.c sees it), or you could add something after the cpu_startup_entry
> > > call to ensure it is not tail call optimized (e.g. just
> > > /* Prevent tail call to cpu_startup_entry because the stack
> > > protector guard has been changed in the middle of this function
> > > and must not be checked before tail calling another function. */
> > > asm ("");
> >
> > That sounds ok-ish to me too.
> >
> > I know you probably can't tell the future :) but what stops gcc from
> > doing the tail-call optimization in the future?
> >
> > Or are optimization decisions behind an inline asm a no-no and will
> > pretty much always stay that way?
>
> GCC intentionally treats asm as a black box, the only thing which it does
> with it is: non-volatile asm (but asm without outputs is implicitly
> volatile) can be CSEd, and if the compiler needs to estimate size, it
> uses some heuristics by counting ; and newlines.
> And it will stay this way.
>
> > And I hope the clang folks don't come around and say, err, nope, we're
> > much more aggressive here.
>
> Unlike GCC, I think clang uses the builtin assembler to parse the string,
> but don't know if it still treats the asms more like black boxes or not.
> Certainly there is a lot of code in the wild that uses inline asm
> as optimization barriers, so if it doesn't, then it would cause a lot of
> problems.
>
> Or go with the for (;;);, I don't think any compiler optimizes those away;
> GCC 10 for C++ can optimize away infinite loops that have some conditional
> exit because the language guarantees forward progress, but the C language
> rules are different and for unconditional infinite loops GCC doesn't
> optimize them away even if explicitly asked to -ffinite-loops.
Lemme add Nick for clang for an opinion:
Nick, we're discussing what would be the cleanest and future-proof
way to disable stack protector for the function in the kernel which
generates the canary value as gcc10 ends up checking that value due to
tail-call optimizing the last function called by start_secondary()...
upthread are all the details.
And question is, can Jakub's suggestions above prevent tail-call
optimization on clang too and how reliable and future proof would that
be if we end up going that way?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Apr 17, 2020 at 10:58:59AM +0200, Jakub Jelinek wrote:
> Or go with the for (;;);, I don't think any compiler optimizes those away;
> GCC 10 for C++ can optimize away infinite loops that have some conditional
> exit because the language guarantees forward progress, but the C language
> rules are different and for unconditional infinite loops GCC doesn't
> optimize them away even if explicitly asked to -ffinite-loops.
'Funnily' there are people building the kernel with C++ :/
On Fri, Apr 17, 2020 at 09:57:39AM +0200, Borislav Petkov wrote:
> Yeah, Peter and I have been discussing something like the below
> yesterday. I don't like the additional exports too much but would
> disable stack protector only for the one function...
I did do promise you bike-shedding... so here goes :-)
> -obj-$(CONFIG_SMP) += smpboot.o
> +
> +nostackprot := $(call cc-option, -fno-stack-protector)
> +CFLAGS_smpboot_aux.o := $(nostackprot)
> +
> +smpboot_all-y := smpboot.o smpboot_aux.o
So how about we call that file: smpboot_nostack.c or, since it only has
the one function in: start_secondary.c ?
On Fri, Apr 17, 2020 at 2:09 AM Borislav Petkov <[email protected]> wrote:
>
> On Fri, Apr 17, 2020 at 10:58:59AM +0200, Jakub Jelinek wrote:
> > On Fri, Apr 17, 2020 at 10:42:24AM +0200, Borislav Petkov wrote:
> > > On Fri, Apr 17, 2020 at 10:07:26AM +0200, Jakub Jelinek wrote:
> > > > If you want minimal changes, you can as I said earlier either
> > > > mark cpu_startup_entry noreturn (in the declaration in some header so that
> > > > smpboot.c sees it), or you could add something after the cpu_startup_entry
> > > > call to ensure it is not tail call optimized (e.g. just
> > > > /* Prevent tail call to cpu_startup_entry because the stack
> > > > protector guard has been changed in the middle of this function
> > > > and must not be checked before tail calling another function. */
> > > > asm ("");
> > >
> > > That sounds ok-ish to me too.
> > >
> > > I know you probably can't tell the future :) but what stops gcc from
> > > doing the tail-call optimization in the future?
> > >
> > > Or are optimization decisions behind an inline asm a no-no and will
> > > pretty much always stay that way?
> >
> > GCC intentionally treats asm as a black box, the only thing which it does
Yep, that's how I would describe how LLVM see's inline asm, too.
> > with it is: non-volatile asm (but asm without outputs is implicitly
> > volatile) can be CSEd, and if the compiler needs to estimate size, it
> > uses some heuristics by counting ; and newlines.
> > And it will stay this way.
I recently implemented parsing support for `asm inline` in Clang; I
could have sworn I saw code in LLVM parsing newlines for a size
estimate years ago, but when implementing `asm inline`, I couldn't
find it. And test cases I wrote that used the C preprocessor to
create thousand+ line inline asm strings would always be inlined,
regardless of the `inline` asm qualifier.
Not sure about implied volatility (...inner stock trader had a laugh
at that...) for output-less asm statements.
> >
> > > And I hope the clang folks don't come around and say, err, nope, we're
> > > much more aggressive here.
> >
> > Unlike GCC, I think clang uses the builtin assembler to parse the string,
> > but don't know if it still treats the asms more like black boxes or not.
> > Certainly there is a lot of code in the wild that uses inline asm
> > as optimization barriers, so if it doesn't, then it would cause a lot of
> > problems.
> >
> > Or go with the for (;;);, I don't think any compiler optimizes those away;
> > GCC 10 for C++ can optimize away infinite loops that have some conditional
> > exit because the language guarantees forward progress, but the C language
> > rules are different and for unconditional infinite loops GCC doesn't
> > optimize them away even if explicitly asked to -ffinite-loops.
>
> Lemme add Nick for clang for an opinion:
>
> Nick, we're discussing what would be the cleanest and future-proof
> way to disable stack protector for the function in the kernel which
Oh, this reminds me of commit d0a8d9378d16 ("x86/paravirt: Make
native_save_fl() extern inline"), where the insertion of stack guards
was also causing some pain.
The cleanest solution would be to have function attributes that say
"yes, I know I said -fstack-protector*, but for this one lone function
I really need -fno-stack-protector. I know what I'm doing and accept
whatever the consequences are." But maybe the attribute would be
shorter than all that? :P
Compared to playing games with each other's inlining heuristics, that
would be the cleanest and future-proof solution. (Then we can even
revert d0a8d9378d16, and use such a function attribute. I somehow
prefer gnu_inline's semantics to ISO C99's extern inline semantics,
and simultaneously hate the problems for which it's used.)
> generates the canary value as gcc10 ends up checking that value due to
> tail-call optimizing the last function called by start_secondary()...
> upthread are all the details.
>
> And question is, can Jakub's suggestions above prevent tail-call
> optimization on clang too and how reliable and future proof would that
> be if we end up going that way?
Sorry, I don't quite follow. The idea is that an empty asm statement
in foo() should prevent foo() from being inlined into bar()?
https://godbolt.org/z/7xBRGY
--
Thanks,
~Nick Desaulniers
On Fri, Apr 17, 2020 at 11:15 AM Nick Desaulniers
<[email protected]> wrote:
>
> On Fri, Apr 17, 2020 at 2:09 AM Borislav Petkov <[email protected]> wrote:
> >
> > On Fri, Apr 17, 2020 at 10:58:59AM +0200, Jakub Jelinek wrote:
> > > On Fri, Apr 17, 2020 at 10:42:24AM +0200, Borislav Petkov wrote:
> > > > On Fri, Apr 17, 2020 at 10:07:26AM +0200, Jakub Jelinek wrote:
> > > > > If you want minimal changes, you can as I said earlier either
> > > > > mark cpu_startup_entry noreturn (in the declaration in some header so that
> > > > > smpboot.c sees it), or you could add something after the cpu_startup_entry
> > > > > call to ensure it is not tail call optimized (e.g. just
> > > > > /* Prevent tail call to cpu_startup_entry because the stack
> > > > > protector guard has been changed in the middle of this function
> > > > > and must not be checked before tail calling another function. */
> > > > > asm ("");
> > > >
> > > > That sounds ok-ish to me too.
> > > >
> > > > I know you probably can't tell the future :) but what stops gcc from
> > > > doing the tail-call optimization in the future?
> > > >
> > > > Or are optimization decisions behind an inline asm a no-no and will
> > > > pretty much always stay that way?
> > >
> > > GCC intentionally treats asm as a black box, the only thing which it does
>
> Yep, that's how I would describe how LLVM see's inline asm, too.
>
> > > with it is: non-volatile asm (but asm without outputs is implicitly
> > > volatile) can be CSEd, and if the compiler needs to estimate size, it
> > > uses some heuristics by counting ; and newlines.
> > > And it will stay this way.
>
> I recently implemented parsing support for `asm inline` in Clang; I
> could have sworn I saw code in LLVM parsing newlines for a size
> estimate years ago, but when implementing `asm inline`, I couldn't
> find it. And test cases I wrote that used the C preprocessor to
> create thousand+ line inline asm strings would always be inlined,
> regardless of the `inline` asm qualifier.
>
> Not sure about implied volatility (...inner stock trader had a laugh
> at that...) for output-less asm statements.
>
> > >
> > > > And I hope the clang folks don't come around and say, err, nope, we're
> > > > much more aggressive here.
> > >
> > > Unlike GCC, I think clang uses the builtin assembler to parse the string,
> > > but don't know if it still treats the asms more like black boxes or not.
> > > Certainly there is a lot of code in the wild that uses inline asm
> > > as optimization barriers, so if it doesn't, then it would cause a lot of
> > > problems.
> > >
> > > Or go with the for (;;);, I don't think any compiler optimizes those away;
> > > GCC 10 for C++ can optimize away infinite loops that have some conditional
> > > exit because the language guarantees forward progress, but the C language
> > > rules are different and for unconditional infinite loops GCC doesn't
> > > optimize them away even if explicitly asked to -ffinite-loops.
> >
> > Lemme add Nick for clang for an opinion:
> >
> > Nick, we're discussing what would be the cleanest and future-proof
> > way to disable stack protector for the function in the kernel which
>
> Oh, this reminds me of commit d0a8d9378d16 ("x86/paravirt: Make
> native_save_fl() extern inline"), where the insertion of stack guards
> was also causing some pain.
>
> The cleanest solution would be to have function attributes that say
> "yes, I know I said -fstack-protector*, but for this one lone function
> I really need -fno-stack-protector. I know what I'm doing and accept
> whatever the consequences are." But maybe the attribute would be
> shorter than all that? :P
>
> Compared to playing games with each other's inlining heuristics, that
s/inlining/tail call/
> would be the cleanest and future-proof solution. (Then we can even
> revert d0a8d9378d16, and use such a function attribute. I somehow
> prefer gnu_inline's semantics to ISO C99's extern inline semantics,
> and simultaneously hate the problems for which it's used.)
>
> > generates the canary value as gcc10 ends up checking that value due to
> > tail-call optimizing the last function called by start_secondary()...
> > upthread are all the details.
> >
> > And question is, can Jakub's suggestions above prevent tail-call
> > optimization on clang too and how reliable and future proof would that
> > be if we end up going that way?
>
> Sorry, I don't quite follow. The idea is that an empty asm statement
> in foo() should prevent foo() from being inlined into bar()?
s/inlined/tail called/
> https://godbolt.org/z/7xBRGY
--
Thanks,
~Nick Desaulniers
On Fri, Apr 17, 2020 at 11:22:25AM -0700, Nick Desaulniers wrote:
> > Sorry, I don't quite follow. The idea is that an empty asm statement
> > in foo() should prevent foo() from being inlined into bar()?
>
> s/inlined/tail called/
Yeah. The thing is, the caller changes the stack protector guard base
value, so at the start of the function it saves a different value then
it compares at the end. But, the function that it calls at the end
actually doesn't return, so this isn't a problem.
If it is tail called though, the stack protector guard checking is done
before the tail call and it crashes.
If the called function is marked with noreturn attribute or _Noreturn,
at least GCC will also not tail call it and all is fine, but not sure
what LLVM does in that case.
Jakub
Ah seems we do have __attribute__((no_selector))
(https://reviews.llvm.org/D46300,
https://releases.llvm.org/7.0.0/tools/clang/docs/AttributeReference.html#no-stack-protector-clang-no-stack-protector-clang-no-stack-protector)
which differs from GCC attribute name.
I'm still catching up on the thread (and my cat is insistent about
sleeping on my lap while I'm trying to use my laptop), but I like
https://lore.kernel.org/lkml/20200417190607.GY2424@tucnak/T/#m23d197d3a66a6c7d04c5444af4f51d940895b412
if it additionally defined __no_stack_protector for compiler-clang.h.
On Fri, Apr 17, 2020 at 12:06 PM Jakub Jelinek <[email protected]> wrote:
>
> On Fri, Apr 17, 2020 at 11:22:25AM -0700, Nick Desaulniers wrote:
> > > Sorry, I don't quite follow. The idea is that an empty asm statement
> > > in foo() should prevent foo() from being inlined into bar()?
> >
> > s/inlined/tail called/
>
> Yeah. The thing is, the caller changes the stack protector guard base
> value, so at the start of the function it saves a different value then
> it compares at the end. But, the function that it calls at the end
> actually doesn't return, so this isn't a problem.
> If it is tail called though, the stack protector guard checking is done
> before the tail call and it crashes.
> If the called function is marked with noreturn attribute or _Noreturn,
> at least GCC will also not tail call it and all is fine, but not sure
> what LLVM does in that case.
Seems fine? https://godbolt.org/z/VEoEfw
(try commenting out the __attribute__((noreturn)) to observe the tail calls.
--
Thanks,
~Nick Desaulniers
On Fri, Apr 17, 2020 at 12:49 PM Nick Desaulniers
<[email protected]> wrote:
>
> Ah seems we do have __attribute__((no_selector))
Sigh, can't type today. __attribute__((no_stack_protector)). I'll
send a patch along for that.
> (https://reviews.llvm.org/D46300,
> https://releases.llvm.org/7.0.0/tools/clang/docs/AttributeReference.html#no-stack-protector-clang-no-stack-protector-clang-no-stack-protector)
> which differs from GCC attribute name.
--
Thanks,
~Nick Desaulniers
From: Peter Zijlstra
> Sent: 17 April 2020 11:38
>
> On Fri, Apr 17, 2020 at 10:58:59AM +0200, Jakub Jelinek wrote:
> > Or go with the for (;;);, I don't think any compiler optimizes those away;
> > GCC 10 for C++ can optimize away infinite loops that have some conditional
> > exit because the language guarantees forward progress, but the C language
> > rules are different and for unconditional infinite loops GCC doesn't
> > optimize them away even if explicitly asked to -ffinite-loops.
>
> 'Funnily' there are people building the kernel with C++ :/
Can't you 'make progress' by using longjmp() to exit a signal handler?
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Hello,
On Fri, 17 Apr 2020, Nick Desaulniers wrote:
> Ah seems we do have __attribute__((no_selector))
> (https://reviews.llvm.org/D46300,
> https://releases.llvm.org/7.0.0/tools/clang/docs/AttributeReference.html#no-stack-protector-clang-no-stack-protector-clang-no-stack-protector)
> which differs from GCC attribute name.
As you will discover upthread that was tried with GCC and found
insufficient, as GCC is a bit surprising with optimize attributes: it
resets every -f option from the command line and applies only the ones
from the attributes. Including a potential -fno-omit-frame-pointer,
causing all kinds of itches :)
(The similar attribute in clang might work less surprising of course).
Ciao,
Michael.
>
> I'm still catching up on the thread (and my cat is insistent about
> sleeping on my lap while I'm trying to use my laptop), but I like
> https://lore.kernel.org/lkml/20200417190607.GY2424@tucnak/T/#m23d197d3a66a6c7d04c5444af4f51d940895b412
> if it additionally defined __no_stack_protector for compiler-clang.h.
>
> On Fri, Apr 17, 2020 at 12:06 PM Jakub Jelinek <[email protected]> wrote:
> >
> > On Fri, Apr 17, 2020 at 11:22:25AM -0700, Nick Desaulniers wrote:
> > > > Sorry, I don't quite follow. The idea is that an empty asm statement
> > > > in foo() should prevent foo() from being inlined into bar()?
> > >
> > > s/inlined/tail called/
> >
> > Yeah. The thing is, the caller changes the stack protector guard base
> > value, so at the start of the function it saves a different value then
> > it compares at the end. But, the function that it calls at the end
> > actually doesn't return, so this isn't a problem.
> > If it is tail called though, the stack protector guard checking is done
> > before the tail call and it crashes.
> > If the called function is marked with noreturn attribute or _Noreturn,
> > at least GCC will also not tail call it and all is fine, but not sure
> > what LLVM does in that case.
>
> Seems fine? https://godbolt.org/z/VEoEfw
> (try commenting out the __attribute__((noreturn)) to observe the tail calls.
>
Ok,
let's try the simple and clean fix first. Nick, would that work on LLVM
too?
And I hope this will remain working and the compiler won't jump over an
inline asm and go nuts.
Thx.
---
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3b9bf8c7e29d..06d2e16bedbb 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -266,6 +266,13 @@ static void notrace start_secondary(void *unused)
wmb();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+
+ /*
+ * Prevent tail call to cpu_startup_entry() because the stack protector
+ * guard has been changed in the middle of this function and must not be
+ * checked before tail calling another function.
+ */
+ asm ("");
}
/**
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Apr 22, 2020 at 12:23:09PM +0200, Borislav Petkov wrote:
> Ok,
>
> let's try the simple and clean fix first. Nick, would that work on LLVM
> too?
>
> And I hope this will remain working and the compiler won't jump over an
> inline asm and go nuts.
>
> Thx.
>
> ---
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 3b9bf8c7e29d..06d2e16bedbb 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -266,6 +266,13 @@ static void notrace start_secondary(void *unused)
>
> wmb();
> cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
> +
> + /*
> + * Prevent tail call to cpu_startup_entry() because the stack protector
> + * guard has been changed in the middle of this function and must not be
> + * checked before tail calling another function.
> + */
> + asm ("");
> }
You haz a whitespace issue there.
Also, can we get this in writing, signed in blood, from the various
compiler teams ;-)
On Wed, Apr 22, 2020 at 01:40:07PM +0200, Peter Zijlstra wrote:
> You haz a whitespace issue there.
Fixed.
> Also, can we get this in writing, signed in blood, from the various
> compiler teams ;-)
Yah, I wouldn't want to go fix this again in gcc11 or so. That's why I
wanted the explicit marking but let's try this first - it is too simple
to pass over without having tested it.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Apr 22, 2020 at 03:49:24PM +0200, Borislav Petkov wrote:
> On Wed, Apr 22, 2020 at 01:40:07PM +0200, Peter Zijlstra wrote:
> > You haz a whitespace issue there.
>
> Fixed.
>
> > Also, can we get this in writing, signed in blood, from the various
> > compiler teams ;-)
>
> Yah, I wouldn't want to go fix this again in gcc11 or so. That's why I
> wanted the explicit marking but let's try this first - it is too simple
> to pass over without having tested it.
If virtual blood is enough, AFAIK GCC has never tried to accept volatile
inline asm (asm ("") is such; non-volatile asm such as int x; asm ("" : "=r" (x));
could be e.g. dead code eliminated) in the statements between function call and
return when deciding about what function can be tail-called or can use
tail-recursion and there are no plans to change that.
Jakub
On 4/22/20 3:55 PM, Jakub Jelinek wrote:
> On Wed, Apr 22, 2020 at 03:49:24PM +0200, Borislav Petkov wrote:
>> On Wed, Apr 22, 2020 at 01:40:07PM +0200, Peter Zijlstra wrote:
>>> You haz a whitespace issue there.
>>
>> Fixed.
>>
>>> Also, can we get this in writing, signed in blood, from the various
>>> compiler teams ;-)
>>
>> Yah, I wouldn't want to go fix this again in gcc11 or so. That's why I
>> wanted the explicit marking but let's try this first - it is too simple
>> to pass over without having tested it.
>
> If virtual blood is enough, AFAIK GCC has never tried to accept volatile
> inline asm (asm ("") is such; non-volatile asm such as int x; asm ("" : "=r" (x));
> could be e.g. dead code eliminated) in the statements between function call and
> return when deciding about what function can be tail-called or can use
> tail-recursion and there are no plans to change that.
>
> Jakub
>
>
>
One possible solution can be usage of a GCC pragma that will disable the tail-call optimization:
$ cat tail.c
int foo(int);
#pragma GCC push_options
#pragma GCC optimize("-fno-optimize-sibling-calls")
int baz(int a)
{
int r = foo(a);
return r;
}
#pragma GCC pop_options
I'm not sure if clang provides something similar (the -foptimize-sibling-calls option
is supported as well).
And as I talked to Boris, I would recommend to come up with a "configure" check
that a compiler does not optimize the key code sequence:
$ cat asm-detect.c
int foo(int a);
int bar(int a)
{
int r = foo(a);
asm ("");
return r;
}
$ gcc -O2 -c asm-detect.c -S -o/dev/stdout | grep jmp
[no output]
Martin
Hello,
On Wed, 22 Apr 2020, Martin Liška wrote:
> One possible solution can be usage of a GCC pragma that will disable the
> tail-call optimization:
>
> $ cat tail.c
> int foo(int);
>
> #pragma GCC push_options
> #pragma GCC optimize("-fno-optimize-sibling-calls")
As we determined upthread (and the reason why we even still have this
thread): the optimize attribute (and pragma) reset flags from the command
line (the case in point was -fno-omit-frame-pointer). So, that's not a
solution for now.
> And as I talked to Boris, I would recommend to come up with a
> "configure" check that a compiler does not optimize the key code
> sequence:
Right. I think the traditional asm (i.e. one without operands) is good
enough for the forseeable future from GCCs side: it relies on documented
behaviour of traditional asms, and hence would be very hard to change.
Ciao,
Michael.
On Wed, Apr 22, 2020 at 04:16:53PM +0200, Martin Liška wrote:
> And as I talked to Boris, I would recommend to come up with a "configure" check
> that a compiler does not optimize the key code sequence:
>
> $ cat asm-detect.c
> int foo(int a);
> int bar(int a)
> {
> int r = foo(a);
> asm ("");
> return r;
> }
>
> $ gcc -O2 -c asm-detect.c -S -o/dev/stdout | grep jmp
> [no output]
That is a good test to run at the beginning of the compilation I guess.
Without the asm("") it produces:
bar:
.LFB0:
.cfi_startproc
jmp foo@PLT
.cfi_endproc
I'd like for LLVM folks to confirm that this is a good test for LLVM too
Trying that here with clang gives:
bar: # @bar
.cfi_startproc
# %bb.0:
jmp foo # TAILCALL
.Lfunc_end0:
so this *looks* like it would work with LLVM too but I might be missing
something...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Apr 22, 2020 at 06:53:39PM +0200, Borislav Petkov wrote:
> > $ cat asm-detect.c
> > int foo(int a);
> > int bar(int a)
> > {
> > int r = foo(a);
> > asm ("");
> > return r;
> > }
> >
> > $ gcc -O2 -c asm-detect.c -S -o/dev/stdout | grep jmp
> > [no output]
>
> That is a good test to run at the beginning of the compilation I guess.
If it is x86 specific, it can work, though I'd suggest -o - instead of
-o/dev/stdout and being more picky on where you want to match the jmp,
as you don't want to match it in comments, or say filenames in the asm file
etc. E.g. require that jmp must be preceeded on the line only by whitespace
and followed by whitespace.
Jakub
On Wed, Apr 22, 2020 at 9:53 AM Borislav Petkov <[email protected]> wrote:
>
> On Wed, Apr 22, 2020 at 04:16:53PM +0200, Martin Liška wrote:
> > And as I talked to Boris, I would recommend to come up with a "configure" check
> > that a compiler does not optimize the key code sequence:
> >
> > $ cat asm-detect.c
> > int foo(int a);
> > int bar(int a)
> > {
> > int r = foo(a);
> > asm ("");
> > return r;
> > }
> >
> > $ gcc -O2 -c asm-detect.c -S -o/dev/stdout | grep jmp
> > [no output]
>
> That is a good test to run at the beginning of the compilation I guess.
>
> Without the asm("") it produces:
>
> bar:
> .LFB0:
> .cfi_startproc
> jmp foo@PLT
> .cfi_endproc
>
> I'd like for LLVM folks to confirm that this is a good test for LLVM too
> Trying that here with clang gives:
>
> bar: # @bar
> .cfi_startproc
> # %bb.0:
> jmp foo # TAILCALL
> .Lfunc_end0:
>
> so this *looks* like it would work with LLVM too but I might be missing
> something...
LGTM https://godbolt.org/z/ExtHx7
--
Thanks,
~Nick Desaulniers
On Wed, Apr 22, 2020 at 3:23 AM Borislav Petkov <[email protected]> wrote:
>
> Ok,
>
> let's try the simple and clean fix first. Nick, would that work on LLVM
> too?
>
> And I hope this will remain working and the compiler won't jump over an
> inline asm and go nuts.
>
> Thx.
>
> ---
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 3b9bf8c7e29d..06d2e16bedbb 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -266,6 +266,13 @@ static void notrace start_secondary(void *unused)
>
> wmb();
> cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
> +
> + /*
> + * Prevent tail call to cpu_startup_entry() because the stack protector
> + * guard has been changed in the middle of this function and must not be
> + * checked before tail calling another function.
Can you add by whom? It's not clear to me which function call in
start_secondary modifies the stack protector guard.
Another question. Do we not want a stack protector at all in this
function? I'm not super familiar with how they work; do we not want
them at all, or simply not to check the guard?
But if we're not going to check it, I think
__attribute__((no_stack_protector)) applied to start_secondary might
be a more precise fix. Though the empty asm statement may be the most
portable at this time, and with a well specified comment, I can live
with it.
> + */
> + asm ("");
> }
>
> /**
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
--
Thanks,
~Nick Desaulniers
On Wed, Apr 22, 2020 at 11:55:50AM -0700, Nick Desaulniers wrote:
> Can you add by whom? It's not clear to me which function call in
> start_secondary modifies the stack protector guard.
How's that
/*
* Prevent tail call to cpu_startup_entry() because the stack protector
* guard has been changed a couple of functions up, in
* boot_init_stack_canary() and must not be checked before tail calling
* another function.
*/
asm ("");
?
> Another question. Do we not want a stack protector at all in this
> function? I'm not super familiar with how they work; do we not want
> them at all, or simply not to check the guard?
Not to check the guard. See the beginning of
arch/x86/include/asm/stackprotector.h about how they work.
> But if we're not going to check it, I think
> __attribute__((no_stack_protector)) applied to start_secondary might
> be a more precise fix.
No such attribute in gcc yet. But yes, this came up a bit upthread, you
can go back in time for details. :)
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Apr 22, 2020 at 12:21 PM Borislav Petkov <[email protected]> wrote:
>
> On Wed, Apr 22, 2020 at 11:55:50AM -0700, Nick Desaulniers wrote:
> > Can you add by whom? It's not clear to me which function call in
> > start_secondary modifies the stack protector guard.
>
> How's that
>
> /*
> * Prevent tail call to cpu_startup_entry() because the stack protector
> * guard has been changed a couple of functions up, in
s/functions/statements/
or
s/functions/function calls/
Sorry to be pedantic and bikeshed a comment! *ducks*
With that you can add my:
Reviewed-by: Nick Desaulniers <[email protected]>
> * boot_init_stack_canary() and must not be checked before tail calling
> * another function.
> */
> asm ("");
>
> ?
>
> > Another question. Do we not want a stack protector at all in this
> > function? I'm not super familiar with how they work; do we not want
> > them at all, or simply not to check the guard?
>
> Not to check the guard. See the beginning of
> arch/x86/include/asm/stackprotector.h about how they work.
>
> > But if we're not going to check it, I think
> > __attribute__((no_stack_protector)) applied to start_secondary might
> > be a more precise fix.
>
> No such attribute in gcc yet. But yes, this came up a bit upthread, you
> can go back in time for details. :)
Filed: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722
(Maybe a link to that might be helpful in the comment, for future
travelers? But I don't feel strongly about that either way, and
trust+defer to your judgement).
--
Thanks,
~Nick Desaulniers
On Wed, Apr 22, 2020 at 02:05:13PM -0700, Nick Desaulniers wrote:
> s/functions/statements/
> or
> s/functions/function calls/
>
> Sorry to be pedantic and bikeshed a comment! *ducks*
Yeah, you beter duck! :-P
> With that you can add my:
> Reviewed-by: Nick Desaulniers <[email protected]>
Ok ok, will fix.
> Filed: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722
> (Maybe a link to that might be helpful in the comment, for future
> travelers? But I don't feel strongly about that either way, and
> trust+defer to your judgement).
Here's an answer to that, also from upthread:
https://lkml.kernel.org/r/20200316180303.GR2156@tucnak
Btw lore.kernel.org has this cool mbox.gz feature:
https://lore.kernel.org/lkml/20200316180303.GR2156@tucnak/t.mbox.gz
This way, you can grep the whole thread, open it with a proper mail
program etc. Very useful for catching up on threads.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Apr 22, 2020 at 2:26 PM Borislav Petkov <[email protected]> wrote:
>
> On Wed, Apr 22, 2020 at 02:05:13PM -0700, Nick Desaulniers wrote:
> > s/functions/statements/
> > or
> > s/functions/function calls/
> >
> > Sorry to be pedantic and bikeshed a comment! *ducks*
>
> Yeah, you better duck! :-P
*gunshots ring out, across the ghetto that is LKML, reminding you that
you've reposted in the wrong text formatting*
> Btw lore.kernel.org has this cool mbox.gz feature:
>
> https://lore.kernel.org/lkml/20200316180303.GR2156@tucnak/t.mbox.gz
>
> This way, you can grep the whole thread, open it with a proper mail
> program etc. Very useful for catching up on threads.
Ah, neat, I see the "mbox.gz" link near the top of a thread near
"Thread overview".
Would be helpful if I actually took the time to RTFM; just dealing
with a lot of constant nonsense regressions lately. One day, I'll be
able to sip margaritas one the roof without anyone noticing I'm not
working...one day.
--
Thanks,
~Nick Desaulniers
On Wed, Apr 22, 2020 at 03:57:34PM -0700, Nick Desaulniers wrote:
> *gunshots ring out, across the ghetto that is LKML, reminding you that
> you've reposted in the wrong text formatting*
Ghetto? You mean more like a bazaar with a lot and a lot of loud people
in it... :-)))
> Would be helpful if I actually took the time to RTFM; just dealing
> with a lot of constant nonsense regressions lately. One day, I'll be
> able to sip margaritas one the roof without anyone noticing I'm not
> working...one day.
Dude, you stole my dream!
:-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Ok,
I have tried to summarize our odyssey so far and here's what I came up
with. Just built latest gcc from the git repo and it seems to work.
Next I need to come up with a slick way of testing the compiler...
Thx.
---
From: Borislav Petkov <[email protected]>
... or the odyssey of trying to disable the stack protector for the
function which generates the stack canary value.
The whole story started with Sergei reporting a boot crash with a kernel
built with gcc-10:
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
Call Trace:
dump_stack
panic
? start_secondary
__stack_chk_fail
start_secondary
secondary_startup_64
-—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
This happens because gcc-10 tail-call optimizes the last function call
in start_secondary() - cpu_startup_entry() - and thus emits a stack
canary check which fails because the canary value changes after the
boot_init_stack_canary() call.
To fix that, the initial attempt was to mark the one function which
generates the stack canary with:
__attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
however, using the optimize attribute doesn't work cumulatively
as the attribute does not add to but rather replaces previously
supplied optimization options - roughly all -fxxx options.
The key one among them being -fno-omit-frame-pointer and thus leading to
not present frame pointer - frame pointer which the kernel needs.
The next attempt to prevent compilers from tail-call optimizing
the last function call cpu_startup_entry(), shy of carving out
start_secondary() into a separate compilation unit and building it with
-fno-stack-protector, is this one.
The current solution is short and sweet, and reportedly, is supported by
both compilers so let's see how far we'll get this time.
Reported-by: Sergei Trofimovich <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/kernel/smpboot.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3b9bf8c7e29d..e9f44727fccd 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -266,6 +266,14 @@ static void notrace start_secondary(void *unused)
wmb();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+
+ /*
+ * Prevent tail call to cpu_startup_entry() because the stack protector
+ * guard has been changed a couple of functions up, in
+ * boot_init_stack_canary() and must not be checked before tail calling
+ * another function.
+ */
+ asm ("");
}
/**
--
2.21.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Apr 23, 2020 at 06:12:24PM +0200, Borislav Petkov wrote:
> Ok,
>
> I have tried to summarize our odyssey so far and here's what I came up
> with. Just built latest gcc from the git repo and it seems to work.
>
> Next I need to come up with a slick way of testing the compiler...
Maybe something like this. Seems to work with both.
---
From: Borislav Petkov <[email protected]>
Date: Thu, 23 Apr 2020 19:28:28 +0200
Subject: [PATCH] Check compiler
Signed-off-by: Borislav Petkov <[email protected]>
---
arch/x86/Makefile | 4 ++++
scripts/x86-check-compiler.sh | 9 +++++++++
2 files changed, 13 insertions(+)
create mode 100755 scripts/x86-check-compiler.sh
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 00e378de8bc0..38d3eec5062e 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -1,6 +1,10 @@
# SPDX-License-Identifier: GPL-2.0
# Unified Makefile for i386 and x86_64
+# Check the compiler
+sane_compiler := $(shell $(srctree)/scripts/x86-check-compiler.sh $(CC))
+$(if $(sane_compiler),$(error $(CC) check failed. Aborting),)
+
# select defconfig based on actual architecture
ifeq ($(ARCH),x86)
ifeq ($(shell uname -m),x86_64)
diff --git a/scripts/x86-check-compiler.sh b/scripts/x86-check-compiler.sh
new file mode 100755
index 000000000000..b2b5b54b6939
--- /dev/null
+++ b/scripts/x86-check-compiler.sh
@@ -0,0 +1,9 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+# Check whether the compiler tail-call optimizes across an asm() statement.
+# Fail the build if it does.
+
+echo "int foo(int a); int bar(int a) { int r = foo(a); asm(\"\"); return r; }" |\
+ $* -O2 -x c -c -S - -o - 2>/dev/null |\
+ grep -E "^[[:blank:]]+jmp[[:blank:]]+.*"
--
2.21.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Apr 23, 2020 at 11:02:09AM -0700, Nick Desaulniers wrote:
> Reviewed-by: Nick Desaulniers <[email protected]>
> Tested-by: Nick Desaulniers <[email protected]>
>
> It's too bad that $(CC) isn't exported yet; IIUC we include the arch
> specific Makefiles then later export $(CC). If that was the case, we
> could just use $(CC) in the shell script, rather than passing it along
> as an argument. Oh well.
Aha, so that's why others pass it. I used
gcc-x86_64-has-stack-protector.sh as an example to slap that one
together.
Below new version with proper commit message this time.
> If I add `echo "hello world"` to the end of
> scripts/x86-check-compiler.sh to verify this stops a build, this is
> the error message I would observe:
> arch/x86/Makefile:6: *** clang check failed. Aborting. Stop.
Right, or you can comment out the asm("") in the script and then it
matches the "jmp" and thus fails the build. As it should be.
Thx.
---
From: Borislav Petkov <[email protected]>
Date: Thu, 23 Apr 2020 19:28:28 +0200
Subject: [PATCH] x86: Check whether the compiler is sane
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Add a check script to verify whether the compiler is sane. This is
x86-only for now and checks one thing only but should be useful for more
checks in the future.
Suggested-by: Martin Liška <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
---
arch/x86/Makefile | 4 ++++
scripts/x86-check-compiler.sh | 9 +++++++++
2 files changed, 13 insertions(+)
create mode 100755 scripts/x86-check-compiler.sh
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 00e378de8bc0..38d3eec5062e 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -1,6 +1,10 @@
# SPDX-License-Identifier: GPL-2.0
# Unified Makefile for i386 and x86_64
+# Check the compiler
+sane_compiler := $(shell $(srctree)/scripts/x86-check-compiler.sh $(CC))
+$(if $(sane_compiler),$(error $(CC) check failed. Aborting),)
+
# select defconfig based on actual architecture
ifeq ($(ARCH),x86)
ifeq ($(shell uname -m),x86_64)
diff --git a/scripts/x86-check-compiler.sh b/scripts/x86-check-compiler.sh
new file mode 100755
index 000000000000..b2b5b54b6939
--- /dev/null
+++ b/scripts/x86-check-compiler.sh
@@ -0,0 +1,9 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+# Check whether the compiler tail-call optimizes across an asm() statement.
+# Fail the build if it does.
+
+echo "int foo(int a); int bar(int a) { int r = foo(a); asm(\"\"); return r; }" |\
+ $* -O2 -x c -c -S - -o - 2>/dev/null |\
+ grep -E "^[[:blank:]]+jmp[[:blank:]]+.*"
--
2.21.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Apr 23, 2020 at 06:12:24PM +0200, Borislav Petkov wrote:
> Ok,
>
> I have tried to summarize our odyssey so far and here's what I came up
> with. Just built latest gcc from the git repo and it seems to work.
>
> Next I need to come up with a slick way of testing the compiler...
>
> Thx.
>
> ---
> From: Borislav Petkov <[email protected]>
>
> ... or the odyssey of trying to disable the stack protector for the
> function which generates the stack canary value.
>
> The whole story started with Sergei reporting a boot crash with a kernel
> built with gcc-10:
>
> Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
> Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
> Call Trace:
> dump_stack
> panic
> ? start_secondary
> __stack_chk_fail
> start_secondary
> secondary_startup_64
> -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
>
> This happens because gcc-10 tail-call optimizes the last function call
> in start_secondary() - cpu_startup_entry() - and thus emits a stack
> canary check which fails because the canary value changes after the
> boot_init_stack_canary() call.
>
> To fix that, the initial attempt was to mark the one function which
> generates the stack canary with:
>
> __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
>
> however, using the optimize attribute doesn't work cumulatively
> as the attribute does not add to but rather replaces previously
> supplied optimization options - roughly all -fxxx options.
>
> The key one among them being -fno-omit-frame-pointer and thus leading to
> not present frame pointer - frame pointer which the kernel needs.
>
> The next attempt to prevent compilers from tail-call optimizing
> the last function call cpu_startup_entry(), shy of carving out
> start_secondary() into a separate compilation unit and building it with
> -fno-stack-protector, is this one.
>
> The current solution is short and sweet, and reportedly, is supported by
> both compilers so let's see how far we'll get this time.
>
> Reported-by: Sergei Trofimovich <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
I'm glad to have the gcc bug opened for the function attribute; thanks
Nick! I needed that for the syscall entry code, but instead went with
a compilation-unit down-grade[1]. I'd much prefer the attribute. :)
-Kees
[1] https://lore.kernel.org/lkml/[email protected]/
> Link: https://lkml.kernel.org/r/[email protected]
> ---
> arch/x86/kernel/smpboot.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 3b9bf8c7e29d..e9f44727fccd 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -266,6 +266,14 @@ static void notrace start_secondary(void *unused)
>
> wmb();
> cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
> +
> + /*
> + * Prevent tail call to cpu_startup_entry() because the stack protector
> + * guard has been changed a couple of functions up, in
> + * boot_init_stack_canary() and must not be checked before tail calling
> + * another function.
> + */
> + asm ("");
> }
>
> /**
> --
> 2.21.0
>
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
--
Kees Cook
On Thu, Apr 23, 2020 at 10:31 AM Borislav Petkov <[email protected]> wrote:
>
> On Thu, Apr 23, 2020 at 06:12:24PM +0200, Borislav Petkov wrote:
> > Ok,
> >
> > I have tried to summarize our odyssey so far and here's what I came up
> > with. Just built latest gcc from the git repo and it seems to work.
> >
> > Next I need to come up with a slick way of testing the compiler...
>
> Maybe something like this. Seems to work with both.
>
> ---
> From: Borislav Petkov <[email protected]>
> Date: Thu, 23 Apr 2020 19:28:28 +0200
> Subject: [PATCH] Check compiler
>
> Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
It's too bad that $(CC) isn't exported yet; IIUC we include the arch
specific Makefiles then later export $(CC). If that was the case, we
could just use $(CC) in the shell script, rather than passing it along
as an argument. Oh well.
> ---
> arch/x86/Makefile | 4 ++++
> scripts/x86-check-compiler.sh | 9 +++++++++
> 2 files changed, 13 insertions(+)
> create mode 100755 scripts/x86-check-compiler.sh
>
> diff --git a/arch/x86/Makefile b/arch/x86/Makefile
> index 00e378de8bc0..38d3eec5062e 100644
> --- a/arch/x86/Makefile
> +++ b/arch/x86/Makefile
> @@ -1,6 +1,10 @@
> # SPDX-License-Identifier: GPL-2.0
> # Unified Makefile for i386 and x86_64
>
> +# Check the compiler
> +sane_compiler := $(shell $(srctree)/scripts/x86-check-compiler.sh $(CC))
> +$(if $(sane_compiler),$(error $(CC) check failed. Aborting),)
If I add `echo "hello world"` to the end of
scripts/x86-check-compiler.sh to verify this stops a build, this is
the error message I would observe:
arch/x86/Makefile:6: *** clang check failed. Aborting. Stop.
> +
> # select defconfig based on actual architecture
> ifeq ($(ARCH),x86)
> ifeq ($(shell uname -m),x86_64)
> diff --git a/scripts/x86-check-compiler.sh b/scripts/x86-check-compiler.sh
> new file mode 100755
> index 000000000000..b2b5b54b6939
> --- /dev/null
> +++ b/scripts/x86-check-compiler.sh
> @@ -0,0 +1,9 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# Check whether the compiler tail-call optimizes across an asm() statement.
> +# Fail the build if it does.
> +
> +echo "int foo(int a); int bar(int a) { int r = foo(a); asm(\"\"); return r; }" |\
> + $* -O2 -x c -c -S - -o - 2>/dev/null |\
> + grep -E "^[[:blank:]]+jmp[[:blank:]]+.*"
> --
> 2.21.0
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
--
Thanks,
~Nick Desaulniers
On Thu, Apr 23, 2020 at 06:12:24PM +0200, Borislav Petkov wrote:
> Ok,
>
> I have tried to summarize our odyssey so far and here's what I came up
> with. Just built latest gcc from the git repo and it seems to work.
>
> Next I need to come up with a slick way of testing the compiler...
>
> Thx.
>
> ---
> From: Borislav Petkov <[email protected]>
>
> ... or the odyssey of trying to disable the stack protector for the
> function which generates the stack canary value.
>
> The whole story started with Sergei reporting a boot crash with a kernel
> built with gcc-10:
>
> Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
> Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
> Call Trace:
> dump_stack
> panic
> ? start_secondary
> __stack_chk_fail
> start_secondary
> secondary_startup_64
> -—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
>
> This happens because gcc-10 tail-call optimizes the last function call
> in start_secondary() - cpu_startup_entry() - and thus emits a stack
> canary check which fails because the canary value changes after the
> boot_init_stack_canary() call.
>
> To fix that, the initial attempt was to mark the one function which
> generates the stack canary with:
>
> __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
>
> however, using the optimize attribute doesn't work cumulatively
> as the attribute does not add to but rather replaces previously
> supplied optimization options - roughly all -fxxx options.
>
> The key one among them being -fno-omit-frame-pointer and thus leading to
> not present frame pointer - frame pointer which the kernel needs.
>
> The next attempt to prevent compilers from tail-call optimizing
> the last function call cpu_startup_entry(), shy of carving out
> start_secondary() into a separate compilation unit and building it with
> -fno-stack-protector, is this one.
>
> The current solution is short and sweet, and reportedly, is supported by
> both compilers so let's see how far we'll get this time.
>
> Reported-by: Sergei Trofimovich <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> Link: https://lkml.kernel.org/r/[email protected]
> ---
> arch/x86/kernel/smpboot.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 3b9bf8c7e29d..e9f44727fccd 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -266,6 +266,14 @@ static void notrace start_secondary(void *unused)
>
> wmb();
> cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
> +
> + /*
> + * Prevent tail call to cpu_startup_entry() because the stack protector
> + * guard has been changed a couple of functions up, in
> + * boot_init_stack_canary() and must not be checked before tail calling
> + * another function.
> + */
> + asm ("");
> }
>
> /**
> --
> 2.21.0
>
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
The comment above boot_init_stack_canary's definition should be updated
to note that it needs to be called from a function that, in addition to
not returning, either has stackprotector disabled or avoids ending in a
tail call.
There are also other calls that likely need to be fixed as well -- in
init/main.c, arch/x86/xen/smp_pv.c, and there is a powerpc version of
start_secondary in arch/powerpc/kernel/smp.c which may also be affected.
Thanks.
On Fri, Apr 24, 2020 at 09:46:57PM -0400, Arvind Sankar wrote:
> The comment above boot_init_stack_canary's definition should be updated
> to note that it needs to be called from a function that, in addition to
> not returning, either has stackprotector disabled or avoids ending in a
> tail call.
How's that?
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index 91e29b6a86a5..237a54f60d6b 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -55,8 +55,12 @@
/*
* Initialize the stackprotector canary value.
*
- * NOTE: this must only be called from functions that never return,
- * and it must always be inlined.
+ * NOTE: this must only be called from functions that never return, it must
+ * always be inlined and it should be called from a compilation unit for
+ * which stack protector is disabled.
+ *
+ * Alternatively, the caller should not end with a function call which gets
+ * tail-call optimized as that would lead to checking a modified canary value.
*/
static __always_inline void boot_init_stack_canary(void)
{
> There are also other calls that likely need to be fixed as well -- in
> init/main.c, arch/x86/xen/smp_pv.c, and there is a powerpc version of
> start_secondary in arch/powerpc/kernel/smp.c which may also be affected.
Yes, there was an attempt to fix former:
https://lkml.kernel.org/r/[email protected]
I probably should point the folks to this thread. CCed.
Boris O, Jürgen, I'm guessing I should fix cpu_bringup_and_idle() too,
see:
https://lkml.kernel.org/r/[email protected]
or do you prefer a separate patch?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 25.04.20 10:57, Borislav Petkov wrote:
> On Fri, Apr 24, 2020 at 09:46:57PM -0400, Arvind Sankar wrote:
>> The comment above boot_init_stack_canary's definition should be updated
>> to note that it needs to be called from a function that, in addition to
>> not returning, either has stackprotector disabled or avoids ending in a
>> tail call.
>
> How's that?
>
> diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
> index 91e29b6a86a5..237a54f60d6b 100644
> --- a/arch/x86/include/asm/stackprotector.h
> +++ b/arch/x86/include/asm/stackprotector.h
> @@ -55,8 +55,12 @@
> /*
> * Initialize the stackprotector canary value.
> *
> - * NOTE: this must only be called from functions that never return,
> - * and it must always be inlined.
> + * NOTE: this must only be called from functions that never return, it must
> + * always be inlined and it should be called from a compilation unit for
> + * which stack protector is disabled.
> + *
> + * Alternatively, the caller should not end with a function call which gets
> + * tail-call optimized as that would lead to checking a modified canary value.
> */
> static __always_inline void boot_init_stack_canary(void)
> {
>
>> There are also other calls that likely need to be fixed as well -- in
>> init/main.c, arch/x86/xen/smp_pv.c, and there is a powerpc version of
>> start_secondary in arch/powerpc/kernel/smp.c which may also be affected.
>
> Yes, there was an attempt to fix former:
>
> https://lkml.kernel.org/r/[email protected]
>
> I probably should point the folks to this thread. CCed.
>
> Boris O, Jürgen, I'm guessing I should fix cpu_bringup_and_idle() too,
> see:
>
> https://lkml.kernel.org/r/[email protected]
>
> or do you prefer a separate patch?
I'm fine with you including it in your patch.
Juergen
On Sat, Apr 25, 2020 at 10:57:59AM +0200, Borislav Petkov wrote:
> On Fri, Apr 24, 2020 at 09:46:57PM -0400, Arvind Sankar wrote:
> > The comment above boot_init_stack_canary's definition should be updated
> > to note that it needs to be called from a function that, in addition to
> > not returning, either has stackprotector disabled or avoids ending in a
> > tail call.
>
> How's that?
>
> diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
> index 91e29b6a86a5..237a54f60d6b 100644
> --- a/arch/x86/include/asm/stackprotector.h
> +++ b/arch/x86/include/asm/stackprotector.h
> @@ -55,8 +55,12 @@
> /*
> * Initialize the stackprotector canary value.
> *
> - * NOTE: this must only be called from functions that never return,
> - * and it must always be inlined.
> + * NOTE: this must only be called from functions that never return, it must
> + * always be inlined and it should be called from a compilation unit for
> + * which stack protector is disabled.
> + *
> + * Alternatively, the caller should not end with a function call which gets
> + * tail-call optimized as that would lead to checking a modified canary value.
> */
> static __always_inline void boot_init_stack_canary(void)
> {
I'd put the clause about stack protector being disabled and the
tail-call one together, to make clear that you still need the never
return and always inline bits. Also, this function is implemented by
multiple arch's and they all have similar comments -- might be better to
consolidate the comment in the generic (dummy) one in
include/linux/stackprotector.h laying out the restrictions that arch
implementations should follow?
>
> > There are also other calls that likely need to be fixed as well -- in
> > init/main.c, arch/x86/xen/smp_pv.c, and there is a powerpc version of
> > start_secondary in arch/powerpc/kernel/smp.c which may also be affected.
>
> Yes, there was an attempt to fix former:
>
> https://lkml.kernel.org/r/[email protected]
There's also the one in init/main.c which is used by multiple
architectures. On x86 at least, the call to arch_call_rest_init at the
end of start_kernel does not get tail-call optimized by gcc-10, but I
don't see anything that actually prevents that from happening. We should
add the asm("") there as well I think, unless the compiler guys see
something about this function that will always prevent the optimization?
Cc'ing PPC list for powerpc start_secondary.
On Sat, Apr 25, 2020 at 11:04:40AM -0400, Arvind Sankar wrote:
> I'd put the clause about stack protector being disabled and the
> tail-call one together, to make clear that you still need the never
> return and always inline bits.
Done.
> Also, this function is implemented by multiple arch's and they all
> have similar comments -- might be better to consolidate the comment in
> the generic (dummy) one in include/linux/stackprotector.h laying out
> the restrictions that arch implementations should follow?
I'm not sure gcc-10 does the same thing on other arches - I'd let gcc
guys chime in here and other arch maintainers to decide what to do.
> There's also the one in init/main.c which is used by multiple
> architectures. On x86 at least, the call to arch_call_rest_init at the
> end of start_kernel does not get tail-call optimized by gcc-10, but I
> don't see anything that actually prevents that from happening. We should
> add the asm("") there as well I think, unless the compiler guys see
> something about this function that will always prevent the optimization?
Hmm, that's what I was afraid of - having to sprinkle this around. Yah, let's
wait for compiler guys to have a look here and then maybe I'll convert that
thing to a macro called
compiler_prevent_tail_call_opt()
or so, so that it can be sprinkled around. ;-\
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Sat, Apr 25, 2020 at 07:31:40PM +0200, Borislav Petkov wrote:
> Hmm, that's what I was afraid of - having to sprinkle this around. Yah, let's
> wait for compiler guys to have a look here and then maybe I'll convert that
> thing to a macro called
>
> compiler_prevent_tail_call_opt()
>
> or so, so that it can be sprinkled around. ;-\
IOW, something like this (ontop) which takes care of the xen case too.
If it needs to be used by all arches, then I'll split the patch:
---
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 73bf8450afa1..4f275ac7830b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -273,7 +273,7 @@ static void notrace start_secondary(void *unused)
* boot_init_stack_canary() and must not be checked before tail calling
* another function.
*/
- asm ("");
+ prevent_tail_call_optimization();
}
/**
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 8fb8a50a28b4..f2adb63b2d7c 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -93,6 +93,7 @@ asmlinkage __visible void cpu_bringup_and_idle(void)
cpu_bringup();
boot_init_stack_canary();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+ prevent_tail_call_optimization();
}
void xen_smp_intr_free_pv(unsigned int cpu)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 034b0a644efc..73f889f64513 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -356,4 +356,7 @@ static inline void *offset_to_ptr(const int *off)
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
+
+#define prevent_tail_call_optimization() asm("")
+
#endif /* __LINUX_COMPILER_H */
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Sat, Apr 25, 2020 at 07:31:40PM +0200, Borislav Petkov wrote:
> > There's also the one in init/main.c which is used by multiple
> > architectures. On x86 at least, the call to arch_call_rest_init at the
> > end of start_kernel does not get tail-call optimized by gcc-10, but I
> > don't see anything that actually prevents that from happening. We should
> > add the asm("") there as well I think, unless the compiler guys see
> > something about this function that will always prevent the optimization?
>
> Hmm, that's what I was afraid of - having to sprinkle this around. Yah, let's
> wait for compiler guys to have a look here and then maybe I'll convert that
> thing to a macro called
>
> compiler_prevent_tail_call_opt()
>
> or so, so that it can be sprinkled around. ;-\
That is a lot more typing then
asm("");
but more seriously, you probably should explain why you do not want a
tail call *anyway*, and in such a comment you can say that is what the
asm is for.
I don't see anything that prevents the tailcall in current code either,
fwiw.
Segher
On Sat, Apr 25, 2020 at 01:37:01PM -0500, Segher Boessenkool wrote:
> That is a lot more typing then
> asm("");
That's why a macro with a hopefully more descriptive name would be
telling more than a mere asm("").
> but more seriously, you probably should explain why you do not want a
> tail call *anyway*, and in such a comment you can say that is what the
> asm is for.
Yes, the final version will have a comment and the whole spiel. This
diff is just me polling the maintainers: "do you want this for your arch
too?" Well, the PPC maintainers only, actually.
The other call in init/main.c would be for everybody.
> I don't see anything that prevents the tailcall in current code either,
> fwiw.
Right, and I don't see a reason why gcc-10 would do that optimization on
x86 only but I better ask first.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Sat, Apr 25, 2020 at 08:53:13PM +0200, Borislav Petkov wrote:
> On Sat, Apr 25, 2020 at 01:37:01PM -0500, Segher Boessenkool wrote:
> > That is a lot more typing then
> > asm("");
>
> That's why a macro with a hopefully more descriptive name would be
> telling more than a mere asm("").
My point is that you should explain at *every use* of this why you cannot
have tail calls *there*. This is very unusual, after all.
There are *very* few places where you want to prevent tail calls, that's
why there is no attribute for it.
Segher
On Sat, Apr 25, 2020 at 02:15:49PM -0500, Segher Boessenkool wrote:
> My point is that you should explain at *every use* of this why you cannot
> have tail calls *there*. This is very unusual, after all.
>
> There are *very* few places where you want to prevent tail calls, that's
> why there is no attribute for it.
Well, there is only one reason *why* so far - to prevent the stack
canary cookie from being checked before returning from the function
which set it. That could be explained once over the macro definition so
that it can be looked up.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Sat, Apr 25, 2020 at 02:15:49PM -0500, Segher Boessenkool wrote:
> On Sat, Apr 25, 2020 at 08:53:13PM +0200, Borislav Petkov wrote:
> > On Sat, Apr 25, 2020 at 01:37:01PM -0500, Segher Boessenkool wrote:
> > > That is a lot more typing then
> > > asm("");
> >
> > That's why a macro with a hopefully more descriptive name would be
> > telling more than a mere asm("").
>
> My point is that you should explain at *every use* of this why you cannot
> have tail calls *there*. This is very unusual, after all.
>
> There are *very* few places where you want to prevent tail calls, that's
> why there is no attribute for it.
>
>
> Segher
Well, there is -fno-optimize-sibling-calls, but we wouldn't be able to
use it via the optimize attribute for the same reason we couldn't use
no-stack-protector.
The following commit has been merged into the x86/build branch of tip:
Commit-ID: 73da86741e7f75c9f1b5c8a2660c90aa41840972
Gitweb: https://git.kernel.org/tip/73da86741e7f75c9f1b5c8a2660c90aa41840972
Author: Borislav Petkov <[email protected]>
AuthorDate: Thu, 23 Apr 2020 19:28:28 +02:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Mon, 27 Apr 2020 13:31:57 +02:00
x86/build: Check whether the compiler is sane
Add a check script to verify whether the compiler is sane. This is
x86-only for now and checks one thing only but should be useful for more
checks in the future.
Suggested-by: Martin Liška <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/Makefile | 4 ++++
scripts/x86-check-compiler.sh | 9 +++++++++
2 files changed, 13 insertions(+)
create mode 100755 scripts/x86-check-compiler.sh
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 00e378d..38d3eec 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -1,6 +1,10 @@
# SPDX-License-Identifier: GPL-2.0
# Unified Makefile for i386 and x86_64
+# Check the compiler
+sane_compiler := $(shell $(srctree)/scripts/x86-check-compiler.sh $(CC))
+$(if $(sane_compiler),$(error $(CC) check failed. Aborting),)
+
# select defconfig based on actual architecture
ifeq ($(ARCH),x86)
ifeq ($(shell uname -m),x86_64)
diff --git a/scripts/x86-check-compiler.sh b/scripts/x86-check-compiler.sh
new file mode 100755
index 0000000..b2b5b54
--- /dev/null
+++ b/scripts/x86-check-compiler.sh
@@ -0,0 +1,9 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+# Check whether the compiler tail-call optimizes across an asm() statement.
+# Fail the build if it does.
+
+echo "int foo(int a); int bar(int a) { int r = foo(a); asm(\"\"); return r; }" |\
+ $* -O2 -x c -c -S - -o - 2>/dev/null |\
+ grep -E "^[[:blank:]]+jmp[[:blank:]]+.*"
The following commit has been merged into the x86/build branch of tip:
Commit-ID: f670269a42bfdd2c83a1118cc3d1b475547eac22
Gitweb: https://git.kernel.org/tip/f670269a42bfdd2c83a1118cc3d1b475547eac22
Author: Borislav Petkov <[email protected]>
AuthorDate: Wed, 22 Apr 2020 18:11:30 +02:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Mon, 27 Apr 2020 13:32:04 +02:00
x86: Fix early boot crash on gcc-10, next try
... or the odyssey of trying to disable the stack protector for the
function which generates the stack canary value.
The whole story started with Sergei reporting a boot crash with a kernel
built with gcc-10:
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
Call Trace:
dump_stack
panic
? start_secondary
__stack_chk_fail
start_secondary
secondary_startup_64
-—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
This happens because gcc-10 tail-call optimizes the last function call
in start_secondary() - cpu_startup_entry() - and thus emits a stack
canary check which fails because the canary value changes after the
boot_init_stack_canary() call.
To fix that, the initial attempt was to mark the one function which
generates the stack canary with:
__attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
however, using the optimize attribute doesn't work cumulatively
as the attribute does not add to but rather replaces previously
supplied optimization options - roughly all -fxxx options.
The key one among them being -fno-omit-frame-pointer and thus leading to
not present frame pointer - frame pointer which the kernel needs.
The next attempt to prevent compilers from tail-call optimizing
the last function call cpu_startup_entry(), shy of carving out
start_secondary() into a separate compilation unit and building it with
-fno-stack-protector, is this one.
The current solution is short and sweet, and reportedly, is supported by
both compilers so let's see how far we'll get this time.
Reported-by: Sergei Trofimovich <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/include/asm/stackprotector.h | 7 ++++++-
arch/x86/kernel/smpboot.c | 8 ++++++++
arch/x86/xen/smp_pv.c | 1 +
include/linux/compiler.h | 6 ++++++
4 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index 91e29b6..9804a79 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -55,8 +55,13 @@
/*
* Initialize the stackprotector canary value.
*
- * NOTE: this must only be called from functions that never return,
+ * NOTE: this must only be called from functions that never return
* and it must always be inlined.
+ *
+ * In addition, it should be called from a compilation unit for which
+ * stack protector is disabled. Alternatively, the caller should not end
+ * with a function call which gets tail-call optimized as that would
+ * lead to checking a modified canary value.
*/
static __always_inline void boot_init_stack_canary(void)
{
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index fe3ab96..4f275ac 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -266,6 +266,14 @@ static void notrace start_secondary(void *unused)
wmb();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+
+ /*
+ * Prevent tail call to cpu_startup_entry() because the stack protector
+ * guard has been changed a couple of function calls up, in
+ * boot_init_stack_canary() and must not be checked before tail calling
+ * another function.
+ */
+ prevent_tail_call_optimization();
}
/**
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 8fb8a50..f2adb63 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -93,6 +93,7 @@ asmlinkage __visible void cpu_bringup_and_idle(void)
cpu_bringup();
boot_init_stack_canary();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+ prevent_tail_call_optimization();
}
void xen_smp_intr_free_pv(unsigned int cpu)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 034b0a6..732754d 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -356,4 +356,10 @@ static inline void *offset_to_ptr(const int *off)
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
+/*
+ * This is needed in functions which generate the stack canary, see
+ * arch/x86/kernel/smpboot.c::start_secondary() for an example.
+ */
+#define prevent_tail_call_optimization() asm("")
+
#endif /* __LINUX_COMPILER_H */
From: Borislav Petkov
> Sent: 25 April 2020 18:53
...
> IOW, something like this (ontop) which takes care of the xen case too.
> If it needs to be used by all arches, then I'll split the patch:
.
> - asm ("");
> + prevent_tail_call_optimization();
> }
One obvious implementation would be a real function call.
Which the compiler would convert into a tail call.
Just to confuse matters :-)
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: a9a3ed1eff3601b63aea4fb462d8b3b92c7c1e7e
Gitweb: https://git.kernel.org/tip/a9a3ed1eff3601b63aea4fb462d8b3b92c7c1e7e
Author: Borislav Petkov <[email protected]>
AuthorDate: Wed, 22 Apr 2020 18:11:30 +02:00
Committer: Borislav Petkov <[email protected]>
CommitterDate: Fri, 15 May 2020 11:48:01 +02:00
x86: Fix early boot crash on gcc-10, third try
... or the odyssey of trying to disable the stack protector for the
function which generates the stack canary value.
The whole story started with Sergei reporting a boot crash with a kernel
built with gcc-10:
Kernel panic — not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5—00235—gfffb08b37df9 #139
Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M—D3H, BIOS F12 11/14/2013
Call Trace:
dump_stack
panic
? start_secondary
__stack_chk_fail
start_secondary
secondary_startup_64
-—-[ end Kernel panic — not syncing: stack—protector: Kernel stack is corrupted in: start_secondary
This happens because gcc-10 tail-call optimizes the last function call
in start_secondary() - cpu_startup_entry() - and thus emits a stack
canary check which fails because the canary value changes after the
boot_init_stack_canary() call.
To fix that, the initial attempt was to mark the one function which
generates the stack canary with:
__attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
however, using the optimize attribute doesn't work cumulatively
as the attribute does not add to but rather replaces previously
supplied optimization options - roughly all -fxxx options.
The key one among them being -fno-omit-frame-pointer and thus leading to
not present frame pointer - frame pointer which the kernel needs.
The next attempt to prevent compilers from tail-call optimizing
the last function call cpu_startup_entry(), shy of carving out
start_secondary() into a separate compilation unit and building it with
-fno-stack-protector, was to add an empty asm("").
This current solution was short and sweet, and reportedly, is supported
by both compilers but we didn't get very far this time: future (LTO?)
optimization passes could potentially eliminate this, which leads us
to the third attempt: having an actual memory barrier there which the
compiler cannot ignore or move around etc.
That should hold for a long time, but hey we said that about the other
two solutions too so...
Reported-by: Sergei Trofimovich <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Tested-by: Kalle Valo <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/include/asm/stackprotector.h | 7 ++++++-
arch/x86/kernel/smpboot.c | 8 ++++++++
arch/x86/xen/smp_pv.c | 1 +
include/linux/compiler.h | 6 ++++++
init/main.c | 2 ++
5 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index 91e29b6..9804a79 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -55,8 +55,13 @@
/*
* Initialize the stackprotector canary value.
*
- * NOTE: this must only be called from functions that never return,
+ * NOTE: this must only be called from functions that never return
* and it must always be inlined.
+ *
+ * In addition, it should be called from a compilation unit for which
+ * stack protector is disabled. Alternatively, the caller should not end
+ * with a function call which gets tail-call optimized as that would
+ * lead to checking a modified canary value.
*/
static __always_inline void boot_init_stack_canary(void)
{
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8c89e4d..2f24c33 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -266,6 +266,14 @@ static void notrace start_secondary(void *unused)
wmb();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+
+ /*
+ * Prevent tail call to cpu_startup_entry() because the stack protector
+ * guard has been changed a couple of function calls up, in
+ * boot_init_stack_canary() and must not be checked before tail calling
+ * another function.
+ */
+ prevent_tail_call_optimization();
}
/**
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 8fb8a50..f2adb63 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -93,6 +93,7 @@ asmlinkage __visible void cpu_bringup_and_idle(void)
cpu_bringup();
boot_init_stack_canary();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+ prevent_tail_call_optimization();
}
void xen_smp_intr_free_pv(unsigned int cpu)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 034b0a6..448c91b 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -356,4 +356,10 @@ static inline void *offset_to_ptr(const int *off)
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
+/*
+ * This is needed in functions which generate the stack canary, see
+ * arch/x86/kernel/smpboot.c::start_secondary() for an example.
+ */
+#define prevent_tail_call_optimization() mb()
+
#endif /* __LINUX_COMPILER_H */
diff --git a/init/main.c b/init/main.c
index 1a5da2c..ad3812b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1036,6 +1036,8 @@ asmlinkage __visible void __init start_kernel(void)
/* Do the rest non-__init'ed, we're now alive */
arch_call_rest_init();
+
+ prevent_tail_call_optimization();
}
/* Call all constructor functions linked into the kernel. */