2007-11-11 06:49:39

by Adrian Bunk

[permalink] [raw]
Subject: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

The gcc from svn that will become gcc 4.3 generates libgcc calls in
cases like the following (on 32bit architectures):

<-- snip -->

static inline void timespec_add_ns(struct timespec *a, u64 ns)
{
...
while(ns >= NSEC_PER_SEC) {
ns -= NSEC_PER_SEC;
a->tv_sec++;
}
...

<-- snip -->

It can make sense to emit assembler code doing division for such C code -
that doesn't seem to be something that would generally be wrong.

But the kernel does (at least on some architectures) not link with
libgcc or ship other code providing the required libgcc functions.

Richard Guenther suggested in gcc bug #32044 to use -fno-tree-scev-cprop
(new option in gcc 4.3) as a workaround and I confirmed that it fixes
the compilation.

Signed-off-by: Adrian Bunk <[email protected]>

---
f2357fcb8addd855f1be6ac9fdf4ef3c3ab8256d
diff --git a/Makefile b/Makefile
index e28dde8..9d8a831 100644
--- a/Makefile
+++ b/Makefile
@@ -527,6 +527,9 @@ KBUILD_CFLAGS += $(call cc-option,-Wdeclaration-after-statement,)
# disable pointer signed / unsigned warnings in gcc 4.0
KBUILD_CFLAGS += $(call cc-option,-Wno-pointer-sign,)

+# workaround to avoid gcc 4.3 emitting libgcc calls (see gcc bug #32044)
+KBUILD_CFLAGS += $(call cc-option,-fno-tree-scev-cprop,)
+
# Add user supplied CPPFLAGS, AFLAGS and CFLAGS as the last assignments
# But warn user when we do so
warn-assign = \


2007-11-11 07:34:50

by Paul Mundt

[permalink] [raw]
Subject: Re: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

On Sun, Nov 11, 2007 at 07:48:29AM +0100, Adrian Bunk wrote:
> But the kernel does (at least on some architectures) not link with
> libgcc or ship other code providing the required libgcc functions.
>
> Richard Guenther suggested in gcc bug #32044 to use -fno-tree-scev-cprop
> (new option in gcc 4.3) as a workaround and I confirmed that it fixes
> the compilation.
>
And some architectures do link it in, so the call to libgcc code doesn't
necessarily matter. Perhaps the architectures that don't link in libgcc
can turn this flag on conditionally, it shouldn't be forced on the
architectures that do link it in.

2007-11-12 06:15:14

by Adrian Bunk

[permalink] [raw]
Subject: the kernel, gcc and libgcc

[ linux-arch added to the Cc,
a copy of my original email is at the bottom ]

On Sun, Nov 11, 2007 at 04:34:22PM +0900, Paul Mundt wrote:
> On Sun, Nov 11, 2007 at 07:48:29AM +0100, Adrian Bunk wrote:
> > But the kernel does (at least on some architectures) not link with
> > libgcc or ship other code providing the required libgcc functions.
> >
> > Richard Guenther suggested in gcc bug #32044 to use -fno-tree-scev-cprop
> > (new option in gcc 4.3) as a workaround and I confirmed that it fixes
> > the compilation.
> >
> And some architectures do link it in, so the call to libgcc code doesn't
> necessarily matter. Perhaps the architectures that don't link in libgcc
> can turn this flag on conditionally, it shouldn't be forced on the
> architectures that do link it in.

We seem to have 3 different kinds of architectures:
- does not provide libgcc at all
- provides part of libgcc itself
- links with libgcc

And we also _might_ get away without requiring -fno-tree-scev-cprop on
some of the 64bit architectures that do not provide or link with libgcc.

This all is horrible fragile, with the root of the problem being a
disagreement between what gcc expects an environment to provide and what
the kernel provides.

After all, the real question is not whether we are able to limit the
workaround of disabling a minor optimization to fewer architectures,
but to get this fixed properly.

Ideally, we solve the latter making my patch not required on any
architecture.

cu
Adrian



<-- snip -->


The gcc from svn that will become gcc 4.3 generates libgcc calls in
cases like the following (on 32bit architectures):

<-- snip -->

static inline void timespec_add_ns(struct timespec *a, u64 ns)
{
...
while(ns >= NSEC_PER_SEC) {
ns -= NSEC_PER_SEC;
a->tv_sec++;
}
...

<-- snip -->

It can make sense to emit assembler code doing division for such C code -
that doesn't seem to be something that would generally be wrong.

But the kernel does (at least on some architectures) not link with
libgcc or ship other code providing the required libgcc functions.

Richard Guenther suggested in gcc bug #32044 to use -fno-tree-scev-cprop
(new option in gcc 4.3) as a workaround and I confirmed that it fixes
the compilation.

Signed-off-by: Adrian Bunk <[email protected]>

---
f2357fcb8addd855f1be6ac9fdf4ef3c3ab8256d
diff --git a/Makefile b/Makefile
index e28dde8..9d8a831 100644
--- a/Makefile
+++ b/Makefile
@@ -527,6 +527,9 @@ KBUILD_CFLAGS += $(call cc-option,-Wdeclaration-after-statement,)
# disable pointer signed / unsigned warnings in gcc 4.0
KBUILD_CFLAGS += $(call cc-option,-Wno-pointer-sign,)

+# workaround to avoid gcc 4.3 emitting libgcc calls (see gcc bug #32044)
+KBUILD_CFLAGS += $(call cc-option,-fno-tree-scev-cprop,)
+
# Add user supplied CPPFLAGS, AFLAGS and CFLAGS as the last assignments
# But warn user when we do so
warn-assign = \

2007-11-12 16:40:55

by Adrian Bunk

[permalink] [raw]
Subject: Re: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

On Mon, Nov 12, 2007 at 05:24:18PM +0100, Bernd Schmidt wrote:
> Adrian Bunk wrote:
> > The gcc from svn that will become gcc 4.3 generates libgcc calls in
> > cases like the following (on 32bit architectures):
> >
> > <-- snip -->
> >
> > static inline void timespec_add_ns(struct timespec *a, u64 ns)
> > {
> > ...
> > while(ns >= NSEC_PER_SEC) {
> > ns -= NSEC_PER_SEC;
> > a->tv_sec++;
> > }
> > ...
> >
> > <-- snip -->
> >
> > It can make sense to emit assembler code doing division for such C code -
> > that doesn't seem to be something that would generally be wrong.
>
> It can be a pretty huge performance regression, so gcc ought to be fixed.

What is better depends on the values of ns and NSEC_PER_SEC.

It can be a performance regression, but there are also cases where it
can improve performance. If gcc produces lower performance code that
would be a bug in gcc that should be reported, but using a division is
not generally wrong.

A more clearer example might be:

<-- snip -->

void foo(u64 ns)
{
if (ns < 10000)
return;

while(ns >= 3) {
ns -= 3;
#ifdef DEBUG
bar(ns);
#endif
}
}

<-- snip -->

With DEBUG not defined you can hardly argue gcc should be fixed to not
use a division for performance reasons.

> Bernd

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2007-11-12 16:48:48

by Bernd Schmidt

[permalink] [raw]
Subject: Re: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

Adrian Bunk wrote:
> The gcc from svn that will become gcc 4.3 generates libgcc calls in
> cases like the following (on 32bit architectures):
>
> <-- snip -->
>
> static inline void timespec_add_ns(struct timespec *a, u64 ns)
> {
> ...
> while(ns >= NSEC_PER_SEC) {
> ns -= NSEC_PER_SEC;
> a->tv_sec++;
> }
> ...
>
> <-- snip -->
>
> It can make sense to emit assembler code doing division for such C code -
> that doesn't seem to be something that would generally be wrong.

It can be a pretty huge performance regression, so gcc ought to be fixed.


Bernd
--
This footer brought to you by insane German lawmakers.
Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

2007-11-12 16:59:21

by Jakub Jelinek

[permalink] [raw]
Subject: Re: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

On Sun, Nov 11, 2007 at 07:48:29AM +0100, Adrian Bunk wrote:
> The gcc from svn that will become gcc 4.3 generates libgcc calls in
> cases like the following (on 32bit architectures):
>
> <-- snip -->
>
> static inline void timespec_add_ns(struct timespec *a, u64 ns)
> {
> ...
> while(ns >= NSEC_PER_SEC) {
> ns -= NSEC_PER_SEC;
> a->tv_sec++;
> }
> ...
>
> <-- snip -->

Blindly using -fno-tree-scev-cprop just to get rid of one case where
this turns out to be a pessimization when kernel knows ns is usually very
small is IMHO a wrong thing, you'd lose many cases where this optimization
can actually improve performance. Instead, for this exact case just
add an optimization barrier to avoid gcc doing this.
Adding asm ("" : "=r" (ns) : "0" (ns)); (or hide it in some macro) into the
loop will do the job just fine.

Jakub

2007-11-12 22:08:23

by Bernd Schmidt

[permalink] [raw]
Subject: Re: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

Adrian Bunk wrote:
> It can be a performance regression, but there are also cases where it
> can improve performance. If gcc produces lower performance code that
> would be a bug in gcc that should be reported, but using a division is
> not generally wrong.
>
> A more clearer example might be:
>
> <-- snip -->
>
> void foo(u64 ns)
> {
> if (ns < 10000)
> return;
>
> while(ns >= 3) {
> ns -= 3;
> #ifdef DEBUG
> bar(ns);
> #endif
> }
> }
>
> <-- snip -->
>
> With DEBUG not defined you can hardly argue gcc should be fixed to not
> use a division for performance reasons.

Absent any clear information about the possible values of ns, IMO this
is a case where the compiler should just assume that the programmer
knows best whether to use a loop or a division. Principle of least
surprise, and all that...


Bernd
--
This footer brought to you by insane German lawmakers.
Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen
Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368
Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif

2007-11-13 06:04:53

by Adrian Bunk

[permalink] [raw]
Subject: Re: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

On Mon, Nov 12, 2007 at 11:07:55PM +0100, Bernd Schmidt wrote:
> Adrian Bunk wrote:
> > It can be a performance regression, but there are also cases where it
> > can improve performance. If gcc produces lower performance code that
> > would be a bug in gcc that should be reported, but using a division is
> > not generally wrong.
> >
> > A more clearer example might be:
> >
> > <-- snip -->
> >
> > void foo(u64 ns)
> > {
> > if (ns < 10000)
> > return;
> >
> > while(ns >= 3) {
> > ns -= 3;
> > #ifdef DEBUG
> > bar(ns);
> > #endif
> > }
> > }
> >
> > <-- snip -->
> >
> > With DEBUG not defined you can hardly argue gcc should be fixed to not
> > use a division for performance reasons.
>
> Absent any clear information about the possible values of ns,
>...

In this example, there is clear information about the possible values
of ns...

> Bernd

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2007-11-13 11:40:38

by Adrian Bunk

[permalink] [raw]
Subject: Re: [RFC: 2.6 patch] add -fno-tree-scev-cprop to KBUILD_CFLAGS

On Mon, Nov 12, 2007 at 11:58:43AM -0500, Jakub Jelinek wrote:
> On Sun, Nov 11, 2007 at 07:48:29AM +0100, Adrian Bunk wrote:
> > The gcc from svn that will become gcc 4.3 generates libgcc calls in
> > cases like the following (on 32bit architectures):
> >
> > <-- snip -->
> >
> > static inline void timespec_add_ns(struct timespec *a, u64 ns)
> > {
> > ...
> > while(ns >= NSEC_PER_SEC) {
> > ns -= NSEC_PER_SEC;
> > a->tv_sec++;
> > }
> > ...
> >
> > <-- snip -->
>
> Blindly using -fno-tree-scev-cprop just to get rid of one case where
> this turns out to be a pessimization when kernel knows ns is usually very
> small is IMHO a wrong thing, you'd lose many cases where this optimization
> can actually improve performance.

It's not about performance, it's about a build error.

> Instead, for this exact case just
> add an optimization barrier to avoid gcc doing this.
> Adding asm ("" : "=r" (ns) : "0" (ns)); (or hide it in some macro) into the
> loop will do the job just fine.

The problem is that this is very fragile - imagine what might happen
when a frequently used struct member gets changed from int to u64.

After all, when looking at current practice, the kernel will support
being built with gcc 4.3 until 2013 or 2014.

> Jakub

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed