allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
having a wider scope when optimizing. This also results in smaller code
when optimizing for size. (gcc4 does not have the stack footprint
problem of gcc3 compilers.)
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Arjan van de Ven <[email protected]>
----
arch/i386/Makefile | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
Index: linux/arch/i386/Makefile
===================================================================
--- linux.orig/arch/i386/Makefile
+++ linux/arch/i386/Makefile
@@ -42,9 +42,9 @@ include $(srctree)/arch/i386/Makefile.cp
GCC_VERSION := $(call cc-version)
cflags-$(CONFIG_REGPARM) += $(shell if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;)
-# Disable unit-at-a-time mode, it makes gcc use a lot more stack
-# due to the lack of sharing of stacklots.
-CFLAGS += $(call cc-option,-fno-unit-at-a-time)
+# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
+# a lot more stack due to the lack of sharing of stacklots:
+CFLAGS += $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
CFLAGS += $(cflags-y)
On Wed, Dec 28, 2005 at 12:47:01PM +0100, Ingo Molnar wrote:
> allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
> having a wider scope when optimizing. This also results in smaller code
> when optimizing for size. (gcc4 does not have the stack footprint
> problem of gcc3 compilers.)
>
> Signed-off-by: Ingo Molnar <[email protected]>
> Signed-off-by: Arjan van de Ven <[email protected]>
> ----
>
> arch/i386/Makefile | 6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)
>
> Index: linux/arch/i386/Makefile
> ===================================================================
> --- linux.orig/arch/i386/Makefile
> +++ linux/arch/i386/Makefile
> @@ -42,9 +42,9 @@ include $(srctree)/arch/i386/Makefile.cp
> GCC_VERSION := $(call cc-version)
> cflags-$(CONFIG_REGPARM) += $(shell if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;)
>
> -# Disable unit-at-a-time mode, it makes gcc use a lot more stack
> -# due to the lack of sharing of stacklots.
> -CFLAGS += $(call cc-option,-fno-unit-at-a-time)
> +# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
> +# a lot more stack due to the lack of sharing of stacklots:
> +CFLAGS += $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
-fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
branch). So unless the minimum supported GCC version to compile kernel is
3.4+, you need to replace
echo "-fno-unit-at-a-time"
with
$(call cc-option,-fno-unit-at-a-time)
.
Jakub
On Wed, Dec 28, 2005 at 07:04:35AM -0500, Jakub Jelinek wrote:
> > +# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
> > +# a lot more stack due to the lack of sharing of stacklots:
> > +CFLAGS += $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
>
> -fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
> branch). So unless the minimum supported GCC version to compile kernel is
> 3.4+, you need to replace
> echo "-fno-unit-at-a-time"
> with
> $(call cc-option,-fno-unit-at-a-time)
The test "$(GCC_VERSION) -lt 0400" takes care of this.
Sam
On Wed, Dec 28, 2005 at 01:28:15PM +0100, Sam Ravnborg wrote:
> On Wed, Dec 28, 2005 at 07:04:35AM -0500, Jakub Jelinek wrote:
> > > +# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
> > > +# a lot more stack due to the lack of sharing of stacklots:
> > > +CFLAGS += $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
> >
> > -fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
> > branch). So unless the minimum supported GCC version to compile kernel is
> > 3.4+, you need to replace
> > echo "-fno-unit-at-a-time"
> > with
> > $(call cc-option,-fno-unit-at-a-time)
> The test "$(GCC_VERSION) -lt 0400" takes care of this.
No.
-fno-unit-at-a-time should be used with GCCs that
a) support it
b) are older than GCC 4.0
The "$(GCC_VERSION) -lt 0400" test takes care of b),
$(call cc-option,-fno-unit-at-a-time) takes care of a).
Jakub
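The two conditions above can be sketched in plain shell. This is only an illustration, not the Makefile code: the function name unit_at_a_time_flag is made up here, the first argument is a version string in the four-digit style that cc-version produces, and the second argument stands in for the probe that cc-option performs against the real compiler.

```shell
# Hypothetical helper: decide whether -fno-unit-at-a-time should be passed.
#   $1 = gcc version as four digits, cc-version style (e.g. 0303, 0304, 0400)
#   $2 = "yes" if the compiler accepts the option, "no" otherwise
#        (assumption: in the Makefile this answer comes from cc-option)
unit_at_a_time_flag() {
    version=$1
    supported=$2
    # b) only compilers older than GCC 4.0 need the workaround
    if [ "$version" -lt 0400 ]; then
        # a) only emit the option if the compiler understands it
        if [ "$supported" = yes ]; then
            echo "-fno-unit-at-a-time"
        fi
    fi
}

unit_at_a_time_flag 0303 no    # old gcc without the option: prints nothing
unit_at_a_time_flag 0304 yes   # gcc 3.4: prints -fno-unit-at-a-time
unit_at_a_time_flag 0400 yes   # gcc 4.0: prints nothing
```

Either test alone leaves a gap: the version check by itself would hand the option to a compiler that rejects it, and the option probe by itself would keep disabling unit-at-a-time on gcc4.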
* Jakub Jelinek <[email protected]> wrote:
> > +CFLAGS += $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
>
> -fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
> branch). So unless the minimum supported GCC version to compile kernel is
> 3.4+, you need to replace
> echo "-fno-unit-at-a-time"
> with
> $(call cc-option,-fno-unit-at-a-time)
> .
indeed - updated patch below.
Ingo
Subject: allow gcc4 to optimize unit-at-a-time
allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
having a wider scope when optimizing. This also results in smaller code
when optimizing for size. (gcc4 does not have the stack footprint
problem of gcc3 compilers.)
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Arjan van de Ven <[email protected]>
----
arch/i386/Makefile | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
Index: linux/arch/i386/Makefile
===================================================================
--- linux.orig/arch/i386/Makefile
+++ linux/arch/i386/Makefile
@@ -42,9 +42,9 @@ include $(srctree)/arch/i386/Makefile.cp
GCC_VERSION := $(call cc-version)
cflags-$(CONFIG_REGPARM) += $(shell if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;)
-# Disable unit-at-a-time mode, it makes gcc use a lot more stack
-# due to the lack of sharing of stacklots.
-CFLAGS += $(call cc-option,-fno-unit-at-a-time)
+# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
+# a lot more stack due to the lack of sharing of stacklots:
+CFLAGS += $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo $(call cc-option,-fno-unit-at-a-time); fi ;)
CFLAGS += $(cflags-y)
On Wed, Dec 28, 2005 at 08:04:35AM -0500, Jakub Jelinek wrote:
> No.
> -fno-unit-at-a-time should be used with GCCs that
> a) support it
> b) are older than GCC 4.0
>
> The "$(GCC_VERSION) -lt 0400" test takes care of b),
> $(call cc-option,-fno-unit-at-a-time) takes care of a).
There was a reason for disabling it unconditionally in the first place.
That was due to unexpectedly huge stack usage, if I understand correctly.
Ingo's patch enables unit-at-a-time only for gcc >= 4.0, which should
have this issue fixed.
If the argument is that we should suddenly enable unit-at-a-time for
gcc before 4.0, then we should revisit the reasons why it was originally
disabled.
Sam
On Wed, Dec 28, 2005 at 01:47:04PM +0100, Sam Ravnborg wrote:
> On Wed, Dec 28, 2005 at 08:04:35AM -0500, Jakub Jelinek wrote:
> > No.
> > -fno-unit-at-a-time should be used with GCCs that
> > a) support it
> > b) are older than GCC 4.0
> >
> > The "$(GCC_VERSION) -lt 0400" test takes care of b),
> > $(call cc-option,-fno-unit-at-a-time) takes care of a).
>
> There was a reason for disabling it unconditionally in the first place.
> That was due to unexpectedly huge stack usage, if I understand correctly.
> Ingo's patch enables unit-at-a-time only for gcc >= 4.0, which should
> have this issue fixed.
>
> If the argument is that we should suddenly enable unit-at-a-time for
> gcc before 4.0, then we should revisit the reasons why it was originally
> disabled.
Hi Jakub.
Reading your mail once more I understood it.
And you are right of course.
Sam - on his way to get more coffee...
Ingo Molnar <[email protected]> writes:
> allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
> having a wider scope when optimizing. This also results in smaller code
> when optimizing for size. (gcc4 does not have the stack footprint
> problem of gcc3 compilers.)
I never had any trouble with stack footprint even with gcc 3.3 on x86-64
and unit-at-a-time and it was always enabled.
But one caveat: turning on unit-at-a-time makes objdump -S / make
foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump
cannot deal with functions being out of order in the object file. This
can be a big problem while analyzing oopses - essentially you have
to analyze the functions without source level information. And with
unit-at-a-time they become bigger so it's more difficult.
But I still think it's a good idea.
-Andi
On Wed, Dec 28, 2005 at 04:30:49PM +0100, Andi Kleen wrote:
> Ingo Molnar <[email protected]> writes:
>
> > allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
> > having a wider scope when optimizing. This also results in smaller code
> > when optimizing for size. (gcc4 does not have the stack footprint
> > problem of gcc3 compilers.)
>
> I never had any trouble with stack footprint even with gcc 3.3 on x86-64
> and unit-at-a-time and it was always enabled.
The particular offenders I remember were in lib/inflate.c running over
4K well before 4K stacks were in mainline, so I fixed it well before
anyone else got to see it.
> But one caveat: turning on unit-at-a-time makes objdump -S / make
> foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump
> cannot deal with functions being out of order in the object file. This
> can be a big problem while analyzing oopses - essentially you have
> to analyze the functions without source level information. And with
> unit-at-a-time they become bigger so it's more difficult.
Yeah, and it also makes stuff like bloat-o-meter output go all to hell.
> But I still think it's a good idea.
Indeed.
--
Mathematics is the supreme nostalgia of our time.
* Andi Kleen <[email protected]> wrote:
> But one caveat: turning on unit-at-a-time makes objdump -S / make
> foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump
> cannot deal with functions being out of order in the object file. This
> can be a big problem while analyzing oopses - essentially you have to
> analyze the functions without source level information. And with
> unit-at-a-time they become bigger so it's more difficult.
>
> But I still think it's a good idea.
hm, i don't seem to have problems with DEBUG_INFO. I picked a random
address within the kernel:
c035766f T schedule_timeout
(gdb) list *0xc035768f
0xc035768f is in schedule_timeout (kernel/timer.c:1075).
1070 * should never happens anyway). You just have the printk()
1071 * that will tell you if something is gone wrong and where.
1072 */
1073 if (timeout < 0)
1074 {
1075 printk(KERN_ERR "schedule_timeout: wrong timeout "
1076 "value %lx from %p\n", timeout,
1077 __builtin_return_address(0));
1078 current->state = TASK_RUNNING;
1079 goto out;
(gdb)
or is it something else that breaks?
Ingo
On Wed, Dec 28, 2005 at 16:41, Ingo Molnar <[email protected]> wrote:
>
> * Andi Kleen <[email protected]> wrote:
>
> > But one caveat: turning on unit-at-a-time makes objdump -S / make
> > foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump
> > cannot deal with functions being out of order in the object file. This
> > can be a big problem while analyzing oopses - essentially you have to
> > analyze the functions without source level information. And with
> > unit-at-a-time they become bigger so it's more difficult.
> >
> > But I still think it's a good idea.
>
> hm, i don't seem to have problems with DEBUG_INFO. I picked a random
> address within the kernel:
>
> c035766f T schedule_timeout
>
> (gdb) list *0xc035768f
> 0xc035768f is in schedule_timeout (kernel/timer.c:1075).
> 1070 * should never happens anyway). You just have the printk()
> 1071 * that will tell you if something is gone wrong and where.
> 1072 */
> 1073 if (timeout < 0)
> 1074 {
> 1075 printk(KERN_ERR "schedule_timeout: wrong timeout "
> 1076 "value %lx from %p\n", timeout,
> 1077 __builtin_return_address(0));
> 1078 current->state = TASK_RUNNING;
> 1079 goto out;
> (gdb)
>
> or is it something else that breaks?
It's objdump that breaks. Try objdump -S. gdb can deal with it, but you
can't generate mixed C/assembly listings with it, so it's hard to match
up the exact lines.
(apparently it's possible through the gdb/mi interface, but I haven't
attempted that yet)
-Andi
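For reference, the two lookups being compared can be put side by side. This is only a sketch: the helper name show_lookup_cmds is made up here, vmlinux stands for an image built with CONFIG_DEBUG_INFO, the address is the schedule_timeout symbol from the example earlier in the thread, and the 64-byte window is arbitrary. The helper only prints the command lines; it does not run them.

```shell
# Hypothetical helper: print the gdb and objdump invocations for one address.
#   $1 = kernel image built with CONFIG_DEBUG_INFO (assumed name: vmlinux)
#   $2 = start address in hex (e.g. the symbol address reported by nm)
show_lookup_cmds() {
    image=$1
    addr=$2
    # arbitrary 64-byte window after the start address for the disassembly
    stop=$(printf '0x%x' $(($addr + 64)))
    # source lines only: the lookup reported to work with unit-at-a-time
    echo "gdb -batch -ex 'list *$addr' $image"
    # mixed source/assembly: the listing reported as unreliable
    echo "objdump -S --start-address=$addr --stop-address=$stop $image"
}

show_lookup_cmds vmlinux 0xc035766f
```

Running the printed gdb line gives the source location for the address, while the printed objdump line is the kind of listing that is said to break when functions are emitted out of order.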