2020-09-15 18:06:15

by Borislav Petkov

Subject: Re: [PATCH] arch: x86: power: cpu: init %gs before __restore_processor_state (clang)

On Tue, Sep 15, 2020 at 10:26:58AM -0700, [email protected] wrote:
> From: Haitao Shan <[email protected]>
>
> This is a workaround which fixes triple fault
> in __restore_processor_state on clang when
> built with LTO.
>
> When load_TR_desc and load_mm_ldt are inlined into
> fix_processor_context due to LTO, they cause
> fix_processor_context (or in this case __restore_processor_state,
> as fix_processor_context was inlined into __restore_processor_state)
> to access the stack canary through %gs, but before
> __restore_processor_state has restored the previous value
> of %gs properly. LLVM appears to be inlining functions with stack
> protectors into functions compiled with -fno-stack-protector,
> which is likely a bug in LLVM's inliner that needs to be fixed.
>
> The LLVM bug is here: https://bugs.llvm.org/show_bug.cgi?id=47479
>
> Signed-off-by: Haitao Shan <[email protected]>
> Signed-off-by: Roman Kiryanov <[email protected]>

Ok, google guys, pls make sure you Cc LKML too as this is where *all*
patches and discussions are archived. Adding it now to Cc.

> ---
> arch/x86/power/cpu.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> index db1378c6ff26..e5677adb2d28 100644
> --- a/arch/x86/power/cpu.c
> +++ b/arch/x86/power/cpu.c
> @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
> /* Needed by apm.c */
> void notrace restore_processor_state(void)
> {
> +#ifdef __clang__
> + // The following code snippet is copied from __restore_processor_state.
> + // Its purpose is to prepare GS segment before the function is called.
> +#ifdef CONFIG_X86_64
> + wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base);
> +#else
> + loadsegment(fs, __KERNEL_PERCPU);
> + loadsegment(gs, __KERNEL_STACK_CANARY);
> +#endif
> +#endif

Ok, so why is the kernel supposed to take yet another ugly workaround
because there's a bug in the compiler?

If it is too late to fix it there, then maybe disable LTO builds for the
buggy version only.

We had a similar discussion this week and we already have one buggy
compiler to deal with and this second one is not making it any easier...

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


2020-09-15 18:42:20

by Nick Desaulniers

Subject: Re: [PATCH] arch: x86: power: cpu: init %gs before __restore_processor_state (clang)

On Tue, Sep 15, 2020 at 10:46 AM Borislav Petkov <[email protected]> wrote:
>
> On Tue, Sep 15, 2020 at 10:26:58AM -0700, [email protected] wrote:
> > From: Haitao Shan <[email protected]>
> >
> > This is a workaround which fixes triple fault
> > in __restore_processor_state on clang when
> > built with LTO.
> >
> > When load_TR_desc and load_mm_ldt are inlined into
> > fix_processor_context due to LTO, they cause
> > fix_processor_context (or in this case __restore_processor_state,
> > as fix_processor_context was inlined into __restore_processor_state)
> > to access the stack canary through %gs, but before
> > __restore_processor_state has restored the previous value
> > of %gs properly. LLVM appears to be inlining functions with stack
> > protectors into functions compiled with -fno-stack-protector,
> > which is likely a bug in LLVM's inliner that needs to be fixed.
> >
> > The LLVM bug is here: https://bugs.llvm.org/show_bug.cgi?id=47479
> >
> > Signed-off-by: Haitao Shan <[email protected]>
> > Signed-off-by: Roman Kiryanov <[email protected]>
>
> Ok, google guys, pls make sure you Cc LKML too as this is where *all*
> patches and discussions are archived. Adding it now to Cc.

Roman, please use ./scripts/get_maintainer.pl (in the kernel tree) for that.

>
> > ---
> > arch/x86/power/cpu.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> > index db1378c6ff26..e5677adb2d28 100644
> > --- a/arch/x86/power/cpu.c
> > +++ b/arch/x86/power/cpu.c
> > @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
> > /* Needed by apm.c */
> > void notrace restore_processor_state(void)
> > {
> > +#ifdef __clang__

This should be CONFIG_CC_IS_CLANG; that is more canonical throughout the tree.
Or if this is only a bug when doing builds with LTO, and LTO is not
yet upstream, then maybe Sami should carry this in his series, at
least until I can fix the bug in Clang. Or guard this with the
CONFIG_LTO_CLANG config (not upstream yet; see Sami's series).

> > + // The following code snippet is copied from __restore_processor_state.
> > + // Its purpose is to prepare GS segment before the function is called.
> > +#ifdef CONFIG_X86_64
> > + wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base);
> > +#else
> > + loadsegment(fs, __KERNEL_PERCPU);
> > + loadsegment(gs, __KERNEL_STACK_CANARY);
> > +#endif
> > +#endif
>
> Ok, so why is the kernel supposed to take yet another ugly workaround
> because there's a bug in the compiler?

This is exactly the same code from __restore_processor_state. If it's
ugly, talk to the author of 7ee18d677989e. ;) All this patch is doing
is moving this up a call frame (though now this is effectively being
run twice).

> If it is too late to fix it there, then maybe disable LTO builds for the
> buggy version only.

We could do that, too. (We can disable LTO on a per-translation-unit
basis in Kbuild.) Note the author of the bug report linked above. :^P
"Revenge of the stack protector"

>
> We had a similar discussion this week and we already have one buggy
> compiler to deal with and this second one is not making it any easier...
--
Thanks,
~Nick Desaulniers

2020-09-15 21:04:53

by Arvind Sankar

Subject: Re: [PATCH] arch: x86: power: cpu: init %gs before __restore_processor_state (clang)

On Tue, Sep 15, 2020 at 11:00:30AM -0700, Nick Desaulniers wrote:
> On Tue, Sep 15, 2020 at 10:46 AM Borislav Petkov <[email protected]> wrote:
> >
> > On Tue, Sep 15, 2020 at 10:26:58AM -0700, [email protected] wrote:
> > > From: Haitao Shan <[email protected]>
> > >
> > > This is a workaround which fixes triple fault
> > > in __restore_processor_state on clang when
> > > built with LTO.
> > >
> > > When load_TR_desc and load_mm_ldt are inlined into
> > > fix_processor_context due to LTO, they cause

Does this apply to load_TR_desc()? That is an inline function even
without LTO, no?

> > > fix_processor_context (or in this case __restore_processor_state,
> > > as fix_processor_context was inlined into __restore_processor_state)
> > > to access the stack canary through %gs, but before
> > > __restore_processor_state has restored the previous value
> > > of %gs properly. LLVM appears to be inlining functions with stack
> > > protectors into functions compiled with -fno-stack-protector,
> > > which is likely a bug in LLVM's inliner that needs to be fixed.
> > >
> > > The LLVM bug is here: https://bugs.llvm.org/show_bug.cgi?id=47479
> > >
> > > Signed-off-by: Haitao Shan <[email protected]>
> > > Signed-off-by: Roman Kiryanov <[email protected]>
> >
> > Ok, google guys, pls make sure you Cc LKML too as this is where *all*
> > patches and discussions are archived. Adding it now to Cc.
>
> Roman, please use ./scripts/get_maintainer.pl (in the kernel tree) for that.
>
> >
> > > ---
> > > arch/x86/power/cpu.c | 10 ++++++++++
> > > 1 file changed, 10 insertions(+)
> > >
> > > diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> > > index db1378c6ff26..e5677adb2d28 100644
> > > --- a/arch/x86/power/cpu.c
> > > +++ b/arch/x86/power/cpu.c
> > > @@ -274,6 +274,16 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
> > > /* Needed by apm.c */
> > > void notrace restore_processor_state(void)
> > > {
> > > +#ifdef __clang__
>
> Should be CONFIG_CC_IS_CLANG; is more canonical throughout the tree.
> Or if this is only a bug when doing builds with LTO, and LTO is not
> yet upstream, then maybe Sami should carry this in his series, at
> least until I can fix the bug in Clang. Or guard this with the
> CONFIG_LTO_CLANG config (not upstream yet; see Sami's series).
>
> > > + // The following code snippet is copied from __restore_processor_state.
> > > + // Its purpose is to prepare GS segment before the function is called.
> > > +#ifdef CONFIG_X86_64
> > > + wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base);
> > > +#else
> > > + loadsegment(fs, __KERNEL_PERCPU);
> > > + loadsegment(gs, __KERNEL_STACK_CANARY);
> > > +#endif
> > > +#endif
> >
> > Ok, so why is the kernel supposed to take yet another ugly workaround
> > because there's a bug in the compiler?
>
> This is exactly the same code from __restore_processor_state. If it's
> ugly, talk to the author of 7ee18d677989e. ;) All this patch is doing
> is moving this up a call frame (though now this is effectively being
> run twice).
>

Possibly dumb question: why does this fix anything? Won't
__restore_processor_state(), which is a static function with only one
caller, in turn get inlined into restore_processor_state(), so that
restore_processor_state() will also have stack protection enabled, and
the canary will be accessed before the MSR or segment register is
loaded?

Thanks.