2008-02-22 00:30:52

by James Morris

[permalink] [raw]
Subject: Boot hang with stack protector on x86_64

I've been getting an immediate hang on boot since:

e06b8b98da071f7dd78fb7822991694288047df0 kbuild: allow -fstack-protector to take effect

This is on an x86_64 system (actually an Intel SDV -- so it might be
"special").

My .config has the following:

CONFIG_CC_STACKPROTECTOR=y
CONFIG_CC_STACKPROTECTOR_ALL=y

Let me know if you need more information.


Interestingly, git-bisect was unable to identify the changeset, and
reliably mis-identified this as the problem:

71c044752cdae89136862495f244d37073e2cf66 V4L/DVB (7080): zr364xx: add support for Pentax Optio 50

(which I don't even build).


- James
--
James Morris
<[email protected]>


2008-02-22 01:14:59

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Boot hang with stack protector on x86_64

On Fri, 22 Feb 2008 11:29:37 +1100 (EST)
James Morris <[email protected]> wrote:

> I've been getting an immediate hang on boot since:
>
> e06b8b98da071f7dd78fb7822991694288047df0 kbuild: allow
> -fstack-protector to take effect
>
> This is on an x86_64 system (actually an Intel SDV -- so it might be
> "special").
>
> My .config has the following:
>
> CONFIG_CC_STACKPROTECTOR=y
> CONFIG_CC_STACKPROTECTOR_ALL=y
>
> Let me know if you need more information.
>
>
Ingo has a set of stackprotector fixes that should go to mainline ASAP.


--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-02-22 07:59:49

by Sam Ravnborg

[permalink] [raw]
Subject: Regression [Was: Boot hang with stack protector on x86_64]

Hi Linus.
On Fri, Feb 22, 2008 at 11:29:37AM +1100, James Morris wrote:
> I've been getting an immediate hang on boot since:
>
> e06b8b98da071f7dd78fb7822991694288047df0 kbuild: allow -fstack-protector to take effect
>
> This is on an x86_64 system (actually an Intel SDV -- so it might be
> "special").
This is a regression. Can you please revert this commit.
We can redo it wehn the -fstack-protector stuff are properly fixed in x86.

Sam

[Kept rest of the mail for reference]

>
> My .config has the following:
>
> CONFIG_CC_STACKPROTECTOR=y
> CONFIG_CC_STACKPROTECTOR_ALL=y
>
> Let me know if you need more information.
>
>
> Interestingly, git-bisect was unable to identify the changeset, and
> reliably mis-identified this as the problem:
>
> 71c044752cdae89136862495f244d37073e2cf66 V4L/DVB (7080): zr364xx: add support for Pentax Optio 50
>
> (which I don't even build).
>
>
> - James
> --
> James Morris
> <[email protected]>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2008-02-22 09:36:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]


* Sam Ravnborg <[email protected]> wrote:

> > This is on an x86_64 system (actually an Intel SDV -- so it might be
> > "special").
>
> This is a regression. Can you please revert this commit. We can redo
> it wehn the -fstack-protector stuff are properly fixed in x86.

No!

James, could you please check whether x86.git#testing:

http://people.redhat.com/mingo/x86.git/README

works fine for you? That has all the current stackprotector fixes. I
plan to send a separate pull request with just the stackprotector fixes
to Linus, they are looking good in testing so far.

Ingo

2008-02-22 10:35:31

by James Morris

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]

On Fri, 22 Feb 2008, Ingo Molnar wrote:

>
> * Sam Ravnborg <[email protected]> wrote:
>
> > > This is on an x86_64 system (actually an Intel SDV -- so it might be
> > > "special").
> >
> > This is a regression. Can you please revert this commit. We can redo
> > it wehn the -fstack-protector stuff are properly fixed in x86.
>
> No!
>
> James, could you please check whether x86.git#testing:
>
> http://people.redhat.com/mingo/x86.git/README
>
> works fine for you? That has all the current stackprotector fixes. I
> plan to send a separate pull request with just the stackprotector fixes
> to Linus, they are looking good in testing so far.

Nope, same problem.

(I followed your instructions in the readme exactly).


- James
--
James Morris
<[email protected]>

2008-02-22 12:18:27

by Ingo Molnar

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]


* James Morris <[email protected]> wrote:

> > works fine for you? That has all the current stackprotector fixes. I
> > plan to send a separate pull request with just the stackprotector
> > fixes to Linus, they are looking good in testing so far.
>
> Nope, same problem.
>
> (I followed your instructions in the readme exactly).

stupid double check, does "git-log | grep stackpro" give you the fixes:

This patch adds a simple self-test capability to the stackprotector
x86: unify stackprotector features
streamline the stackprotector features under a single option
x86: stackprotector: mix TSC to the boot canary
x86: fix the stackprotector canary of the boot CPU
stackprotector: add boot_init_stack_canary()
stackprotector: include files

?

Please send me your full .config and the gcc version you used for
building the failing kernel.

Ingo

2008-02-22 13:03:58

by James Morris

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]

On Fri, 22 Feb 2008, Ingo Molnar wrote:

>
> * James Morris <[email protected]> wrote:
>
> > > works fine for you? That has all the current stackprotector fixes. I
> > > plan to send a separate pull request with just the stackprotector
> > > fixes to Linus, they are looking good in testing so far.
> >
> > Nope, same problem.
> >
> > (I followed your instructions in the readme exactly).
>
> stupid double check, does "git-log | grep stackpro" give you the fixes:
>
> This patch adds a simple self-test capability to the stackprotector
> x86: unify stackprotector features
> streamline the stackprotector features under a single option
> x86: stackprotector: mix TSC to the boot canary
> x86: fix the stackprotector canary of the boot CPU
> stackprotector: add boot_init_stack_canary()
> stackprotector: include files
>
> ?

Yes.

>
> Please send me your full .config and the gcc version you used for
> building the failing kernel.

config attached.

$ gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-cpu=generic
--host=x86_64-redhat-linux
Thread model: posix
gcc version 4.1.2 20071124 (Red Hat 4.1.2-36)


--

James Morris
<[email protected]>


Attachments:
config.txt (29.83 kB)

2008-02-22 13:28:20

by Ingo Molnar

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]


* James Morris <[email protected]> wrote:

> > Please send me your full .config and the gcc version you used for
> > building the failing kernel.
>
> config attached.

thanks, i'll try that.

> gcc version 4.1.2 20071124 (Red Hat 4.1.2-36)

ok, i have a slightly older one:

gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)

i'll try to upgrade to your version to make sure we see the same
regression.

Ingo

2008-02-22 15:44:26

by Linus Torvalds

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]



On Fri, 22 Feb 2008, Sam Ravnborg wrote:
>
> This is a regression. Can you please revert this commit.

Not really. The thing is, CONFIG_CC_STACKPROTECTOR has never done anything
at all, now it does, and it shows that it never worked.

So the commit that made it do something shouldn't be reverted, but
CONFIG_CC_STACKPROTECTOR should be marked BROKEN, because it obviously is
broken right now.

But keeping the config option, and just not making it do anything is
misleading and wrong.

So just something like this? To make sure normal people don't enable it..

Linus

---
arch/x86/Kconfig | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3be2305..4a88cf7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1054,7 +1054,7 @@ config SECCOMP

config CC_STACKPROTECTOR
bool "Enable -fstack-protector buffer overflow detection (EXPERIMENTAL)"
- depends on X86_64 && EXPERIMENTAL
+ depends on X86_64 && EXPERIMENTAL && BROKEN
help
This option turns on the -fstack-protector GCC feature. This
feature puts, at the beginning of critical functions, a canary

2008-02-22 15:59:55

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]

On Fri, 22 Feb 2008 07:43:08 -0800 (PST)
Linus Torvalds <[email protected]> wrote:

>
>
> On Fri, 22 Feb 2008, Sam Ravnborg wrote:
> >
> > This is a regression. Can you please revert this commit.
>
> Not really. The thing is, CONFIG_CC_STACKPROTECTOR has never done
> anything at all, now it does, and it shows that it never worked.
>
> So the commit that made it do something shouldn't be reverted, but
> CONFIG_CC_STACKPROTECTOR should be marked BROKEN, because it
> obviously is broken right now.
>
> But keeping the config option, and just not making it do anything is
> misleading and wrong.
>
> So just something like this? To make sure normal people don't enable
> it..


looks fair; once the patches to make it work are merged the B0RKEN can
go away again easliy

2008-02-22 16:12:58

by Sam Ravnborg

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]

On Fri, Feb 22, 2008 at 07:43:08AM -0800, Linus Torvalds wrote:
>
>
> On Fri, 22 Feb 2008, Sam Ravnborg wrote:
> >
> > This is a regression. Can you please revert this commit.
>
> Not really. The thing is, CONFIG_CC_STACKPROTECTOR has never done anything
> at all, now it does, and it shows that it never worked.
And this made a machine non-bootable that could boot before.

>
> So the commit that made it do something shouldn't be reverted, but
> CONFIG_CC_STACKPROTECTOR should be marked BROKEN, because it obviously is
> broken right now.
>
> But keeping the config option, and just not making it do anything is
> misleading and wrong.
>
> So just something like this? To make sure normal people don't enable it..
Whatever is fine with me - I hope Ingo & Co fixes the root cause soon so
we can get the right fix in.

Sam

2008-02-23 06:19:52

by Ingo Molnar

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]


James,

could you try the fix below ontop of x86.git#testing, does it solve your
boot hang?

Ingo

--------------->
Subject: x86: stackprotector fix: do not zap %gs
From: Ingo Molnar <[email protected]>
Date: Sat Feb 23 07:06:55 CET 2008

pda_init() puts 0 into %gs - that's wrong because any %gs access will
fault from now on and we already have a dummy PDA set up that can be
accessed just fine.

This normally does not matter because almost nothing accesses %gs this
early ... but the stackprotector now does to read the canary ...

Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/setup64.c | 2 --
1 file changed, 2 deletions(-)

Index: linux-x86.q/arch/x86/kernel/setup64.c
===================================================================
--- linux-x86.q.orig/arch/x86/kernel/setup64.c
+++ linux-x86.q/arch/x86/kernel/setup64.c
@@ -165,8 +165,6 @@ void pda_init(int cpu)
{
struct x8664_pda *pda = cpu_pda(cpu);

- /* Setup up data that may be needed in __get_free_pages early */
- asm volatile("movl %0,%%fs ; movl %0,%%gs" :: "r" (0));
/* Memory clobbers used to order PDA accessed */
mb();
wrmsrl(MSR_GS_BASE, pda);

2008-02-23 07:38:25

by Ingo Molnar

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]


* Ingo Molnar <[email protected]> wrote:

> could you try the fix below ontop of x86.git#testing, does it solve
> your boot hang?

find below another fix that is somewhat better as it does not affect the
native kernel and !PARAVIRT.

btw., this also explains why this bug wasnt reported sooner against
x86.git#testing: people done normally use PARAVIRT on 64-bit yet.
(there is no 64-bit host support)

Ingo

---------------->
Subject: x86: stackprotector & PARAVIRT fix
From: Ingo Molnar <[email protected]>
Date: Sat Feb 23 07:06:55 CET 2008

on paravirt enabled 64-bit kernels the paravirt ops do function
calls themselves - which is bad with the stackprotector - for
example pda_init() loads 0 into %gs and then does MSR_GS_BASE
write (which modifies gs.base) - but that MSR write is a function
call on paravirt, which with stackprotector tries to read the
stack canary from the PDA ... crashing the bootup.

the solution was suggested by Arjan van de Ven: to exclude paravirt.c
from stackprotector, too many lowlevel functionality is in it. It's
not like we'll have paravirt functions with character arrays on
their stack anyway...

Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/kernel/Makefile | 1 +
1 file changed, 1 insertion(+)

Index: linux-x86.q/arch/x86/kernel/Makefile
===================================================================
--- linux-x86.q.orig/arch/x86/kernel/Makefile
+++ linux-x86.q/arch/x86/kernel/Makefile
@@ -15,6 +15,7 @@ nostackp := $(call cc-option, -fno-stack
CFLAGS_vsyscall_64.o := $(PROFILING) -g0 $(nostackp)
CFLAGS_hpet.o := $(nostackp)
CFLAGS_tsc_64.o := $(nostackp)
+CFLAGS_paravirt.o := $(nostackp)

obj-y := process_$(BITS).o signal_$(BITS).o entry_$(BITS).o
obj-y += traps_$(BITS).o irq_$(BITS).o

2008-02-24 22:55:05

by James Morris

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]

On Sat, 23 Feb 2008, Ingo Molnar wrote:

>
> * Ingo Molnar <[email protected]> wrote:
>
> > could you try the fix below ontop of x86.git#testing, does it solve
> > your boot hang?
>
> find below another fix that is somewhat better as it does not affect the
> native kernel and !PARAVIRT.

This works.

>
> btw., this also explains why this bug wasnt reported sooner against
> x86.git#testing: people done normally use PARAVIRT on 64-bit yet.
> (there is no 64-bit host support)
>
> Ingo
>
> ---------------->
> Subject: x86: stackprotector & PARAVIRT fix
> From: Ingo Molnar <[email protected]>
> Date: Sat Feb 23 07:06:55 CET 2008
>
> on paravirt enabled 64-bit kernels the paravirt ops do function
> calls themselves - which is bad with the stackprotector - for
> example pda_init() loads 0 into %gs and then does MSR_GS_BASE
> write (which modifies gs.base) - but that MSR write is a function
> call on paravirt, which with stackprotector tries to read the
> stack canary from the PDA ... crashing the bootup.
>
> the solution was suggested by Arjan van de Ven: to exclude paravirt.c
> from stackprotector, too many lowlevel functionality is in it. It's
> not like we'll have paravirt functions with character arrays on
> their stack anyway...
>
> Signed-off-by: Arjan van de Ven <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
> ---
> arch/x86/kernel/Makefile | 1 +
> 1 file changed, 1 insertion(+)
>
> Index: linux-x86.q/arch/x86/kernel/Makefile
> ===================================================================
> --- linux-x86.q.orig/arch/x86/kernel/Makefile
> +++ linux-x86.q/arch/x86/kernel/Makefile
> @@ -15,6 +15,7 @@ nostackp := $(call cc-option, -fno-stack
> CFLAGS_vsyscall_64.o := $(PROFILING) -g0 $(nostackp)
> CFLAGS_hpet.o := $(nostackp)
> CFLAGS_tsc_64.o := $(nostackp)
> +CFLAGS_paravirt.o := $(nostackp)
>
> obj-y := process_$(BITS).o signal_$(BITS).o entry_$(BITS).o
> obj-y += traps_$(BITS).o irq_$(BITS).o
>

--
James Morris
<[email protected]>

2008-02-25 08:23:49

by Ingo Molnar

[permalink] [raw]
Subject: Re: Regression [Was: Boot hang with stack protector on x86_64]


* James Morris <[email protected]> wrote:

> > > could you try the fix below ontop of x86.git#testing, does it
> > > solve your boot hang?
> >
> > find below another fix that is somewhat better as it does not affect
> > the native kernel and !PARAVIRT.
>
> This works.

thanks. We'll delay the stackprotector fixes for v2.6.26 (it's been
broken for too long), but if you want to have it you can pick it up from
x86.git.

Ingo