2003-03-06 11:11:43

by Andrew Morton

[permalink] [raw]
Subject: [patch] work around gcc-3.x inlining bugs


This patch:

--- 25/include/linux/compiler.h~gcc3-inline-fix 2003-03-06 03:02:43.000000000 -0800
+++ 25-akpm/include/linux/compiler.h 2003-03-06 03:11:42.000000000 -0800
@@ -1,6 +1,11 @@
#ifndef __LINUX_COMPILER_H
#define __LINUX_COMPILER_H

+#if __GNUC__ >= 3
+#define inline __inline__ __attribute__((always_inline))
+#define __inline__ __inline__ __attribute__((always_inline))
+#endif
+
/* Somewhere in the middle of the GCC 2.96 development cycle, we implemented
a mechanism by which the user can annotate likely branch directions and
expect the blocks to be reordered appropriately. Define __builtin_expect


shrinks my 3.2.1-compiled kernel text by about 64 kbytes:

text data bss dec hex filename
3316138 574844 726816 4617798 467646 vmlinux-before
3249255 555436 727204 4531895 4526b7 vmlinux-after

mnm:/tmp> nm vmlinux-before|grep __constant_c_and_count_memset | wc
233 699 9553
mnm:/tmp> nm vmlinux-after|grep __constant_c_and_count_memset | wc
13 39 533

Can anyone see a problem with it?


2003-03-06 11:42:17

by Andi Kleen

[permalink] [raw]
Subject: Re: [patch] work around gcc-3.x inlining bugs

Andrew Morton <[email protected]> writes:

> This patch:
[...]

I submitted a similar patch (using -include) it to Linus some time ago.
It's even required to work around gcc 3.3 inlining bugs.
Unfortunately he didn't like it and prefered __force_inline
to be added to the places that really rely on inline.

I experimented with -finline-limit=<huge number> to get it to obey inline,
but that doesn't fully work. The only way to get it to work in 3.2 and 3.3
is to specify various long and weird --param arguments. In 3.0 the only
way is to change the values in the compiler source and recompile.

So doing it with always_inline is much less ugly and also less compiler
version dependent.

I always add them currently by hand to linux/brlock.h to get 2.5
to compile with gcc 3.3 on x86-64.

I think it is the right thing to do. In kernel land when we say inline
we mean inline. Don't expect the compiler to second guess that,
especially since it doesn't seem to be very good at that.

There was also talk on the gcc mailing list to add an
-fobey-inline switch, but nothing got out of it. And it would be 3.3+
only anwyays.

I didn't see a 64K shrink however on x86-64, but I guess that's
because it has a much cleaner uaccess.h ;) For i386 it is a nice bonus.

-Andi

2003-03-06 15:47:17

by Jeff Garzik

[permalink] [raw]
Subject: Re: [patch] work around gcc-3.x inlining bugs

Andrew Morton wrote:
> This patch:
>
> --- 25/include/linux/compiler.h~gcc3-inline-fix 2003-03-06 03:02:43.000000000 -0800
> +++ 25-akpm/include/linux/compiler.h 2003-03-06 03:11:42.000000000 -0800
> @@ -1,6 +1,11 @@
> #ifndef __LINUX_COMPILER_H
> #define __LINUX_COMPILER_H
>
> +#if __GNUC__ >= 3
> +#define inline __inline__ __attribute__((always_inline))
> +#define __inline__ __inline__ __attribute__((always_inline))
> +#endif


I think there is also either __inline or inline__?

2003-03-06 20:16:08

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [patch] work around gcc-3.x inlining bugs

Followup to: <[email protected]>
By author: Andi Kleen <[email protected]>
In newsgroup: linux.dev.kernel
>
> I submitted a similar patch (using -include) it to Linus some time ago.
> It's even required to work around gcc 3.3 inlining bugs.
> Unfortunately he didn't like it and prefered __force_inline
> to be added to the places that really rely on inline.
>

I would like to suggest that always_inline is defined as
"extern __inline__" plus whatever the particular compiler needs, and
that plain inline is redefined as "static __inline__" or whatever.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

2003-03-06 20:30:54

by Daniel Egger

[permalink] [raw]
Subject: Re: [patch] work around gcc-3.x inlining bugs

Am Don, 2003-03-06 um 12.22 schrieb Andrew Morton:

> shrinks my 3.2.1-compiled kernel text by about 64 kbytes:
>
> text data bss dec hex filename
> 3316138 574844 726816 4617798 467646 vmlinux-before
> 3249255 555436 727204 4531895 4526b7 vmlinux-after
>
> mnm:/tmp> nm vmlinux-before|grep __constant_c_and_count_memset | wc
> 233 699 9553
> mnm:/tmp> nm vmlinux-after|grep __constant_c_and_count_memset | wc
> 13 39 533

> Can anyone see a problem with it?

Looks good to me...

text data bss dec hex filename
1842635 167450 140292 2150377 20cfe9 vmlinux
1867787 167450 140292 2175529 213229 vmlinux.old

I hope this doesn't trigger any latent compiler bugs but in this
case we should beat the gcc people to it...

--
Servus,
Daniel


Attachments:
signature.asc (189.00 B)
Dies ist ein digital signierter Nachrichtenteil

2003-03-07 00:16:05

by Kurt Garloff

[permalink] [raw]
Subject: Re: [patch] work around gcc-3.x inlining bugs

Hi Andi,

On Thu, Mar 06, 2003 at 12:52:48PM +0100, Andi Kleen wrote:
> Andrew Morton <[email protected]> writes:
>
> > This patch:
> [...]
>
> I experimented with -finline-limit=<huge number> to get it to obey inline,
> but that doesn't fully work. The only way to get it to work in 3.2 and 3.3
> is to specify various long and weird --param arguments. In 3.0 the only
> way is to change the values in the compiler source and recompile.

--param max-inline-insns-single
The fact that this parameter does not get initialized by using
-finline-limit is a bug. A fix for this has already gone into CVS HEAD
(3.4) and is pending for 3.3.

[...?]
> I think it is the right thing to do. In kernel land when we say inline
> we mean inline. Don't expect the compiler to second guess that,
> especially since it doesn't seem to be very good at that.

The compiler solely judges by the size of the function to be inlined,
no magic involved.
There are two reasons, why this may not be what you expect:
- The allowable size can get smaller if in the function from that
you're calling and the one above and ... were already inlined.
At some moment this recursive inlining has to be stopped to
not results in excessive compile time resource requirements
- When calculating the size of a function, the compiler counts
tree tokens, each estimated to yield 10 RTL instructions.
It does unfortunately not take into account code that will
be eliminated by optimization (constant propagation, dead
code elimination, ...) nor does it have a very good idea
about the size of inline assembly.

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers SuSE Labs
SuSE Linux AG, Nuernberg, DE SCSI, Security


Attachments:
(No filename) (1.77 kB)
(No filename) (198.00 B)
Download all attachments

2003-03-09 22:31:27

by Kurt Garloff

[permalink] [raw]
Subject: Re: [patch] work around gcc-3.x inlining bugs

Hi,

On Thu, Mar 06, 2003 at 10:28:45PM +0100, Kurt Garloff wrote:
> The fact that this parameter does not get initialized by using
> -finline-limit is a bug. A fix for this has already gone into CVS HEAD
> (3.4) and is pending for 3.3.

Patch has been applied to 3.3 as well now.
Thanks to Geoff Keating and Dale Johannesen.

Regards,
--
Kurt Garloff <[email protected]> Eindhoven, NL
GPG key: See mail header, key servers SuSE Labs (Head)
SuSE Linux AG, Nuernberg, DE SCSI, Security


Attachments:
(No filename) (553.00 B)
(No filename) (198.00 B)
Download all attachments