2009-04-22 09:02:35

by Ingo Molnar

Subject: [rfc] built-in native compiler for Linux?


* Steven Rostedt <[email protected]> wrote:

> I think it was Ingo that let out the idea, and I'm starting to
> like it.
>
> Perhaps we should fork off gcc and ship Linux with its own
> compiler. This way we can optimize it for the kernel and not worry
> about any userland optimizations.

I didn't suggest forking GCC. A kernel-special GCC would likely just
become an inferior fork of GCC over time and would fizzle out.
There's 100 times more user-space code than kernel-space code and
GCC is too large and too legacy-laden to really be appropriate for
that purpose.

What i think makes sense is to build a _new_ precompiler / compiler
/ assembler / linker combo for Linux, from scratch, hosted in the
kernel proper.

In the past 15 years of Linux we've invested a lot of time and
effort into working around and dealing with compiler crap. We wasted
a lot of opportunities waiting years for sane compiler features to
show up. We might as well have invested that effort into building
our own compiler and could stop bothering about externalities. The
Linux kernel project certainly involves the right kind of people who
could make something like this happen.

A good technical basis for that would be Sparse, and it could start
by acting as a drop-in replacement for CPP and it could feed its
output to GCC with little changes. Sparse is small, has a very tidy
code base and is already useful today as an extended static source
code checker.

The Sparse codebase could move into the kernel proper, under
linux/sparse/ or so - so the preprocessor/compiler and the kernel
could be in precise feature and bugfix lock-step with no artificial
external synchronization.

We have a lot of annoying preprocessor limitations that Sparse could
help with straight away. We'd also get Sparse type checking by
default. So it's helpful even without any code generator support.

Then, if this model works out, we could experiment with adding a
code generator backend to Sparse. I think Jeff Garzik experimented
with that in the past with some surprisingly quick (but incomplete)
results.

Since most of the performance-critical code in Linux is
hand-optimized already, we don't even need all that many complex,
exotic optimizations - we want to encourage common-sense coding
practices. Furthermore, a lot of optimizations in GCC are driven by
SPECint and SPECfp benchmarketing, with little practical relevance
to 99% of the apps, including the kernel.

There would always be an 'output to GCC' kind of compatible build
channel as well, for CPU architectures that don't have native code
generator support yet. We'd also do that to generally keep our
options open, in case we are wrong about it all or in case some even
better compiler project pops up.

Ingo


2009-04-23 20:39:16

by Anton Ertl

Subject: Re: [rfc] built-in native compiler for Linux?

Ingo Molnar <[email protected]> writes:
> Furthermore, a lot of optimizations in GCC are driven by
> SPECint and SPECfp benchmarketing, with little practical relevance
> to 99% of the apps, including the kernel.

Right, and it's not just the kernel that has to work around GCC
breakage, it's also many apps (possibly all of them except SPEC
benchmarks, and the few 100% C99-compliant programs (probably none
with more than 500 lines of code)).

So I guess a decent alternative to GCC would attract not just the
kernel people, but also many (most?) applications people. Then GCC
would no longer be misused for ordinary applications or kernels and it
could focus on its true destiny: compiling SPEC benchmarks.

Whether to build such a compiler mostly from scratch as you suggest or
as a fork of GCC is up to whoever does it. The GCC fork variant has
the advantage of having a lot of targets available already, but the
disadvantage of a large and complex code base.

- anton
--
M. Anton Ertl Some things have to be seen to be believed
[email protected] Most things have to be believed to be seen
http://www.complang.tuwien.ac.at/anton/home.html

2009-04-25 05:30:14

by David Miller

Subject: Re: [rfc] built-in native compiler for Linux?

From: Christoph Lameter <[email protected]>
Date: Fri, 24 Apr 2009 14:34:49 -0400 (EDT)

> On Wed, 22 Apr 2009, Ingo Molnar wrote:
>
>> What i think makes sense is to build a _new_ precompiler / compiler
>> / assembler / linker combo for Linux, from scratch, hosted in the
>> kernel proper.
>>
>> A good technical basis for that would be Sparse, and it could start
>> by acting as a drop-in replacement for CPP and it could feed its
>> output to GCC with little changes. Sparse is small, has a very tidy
>> code base and is already useful today as an extended static source
>> code checker.
>
> A new preprocessor would be great. If we can make sparse do what CPP does
> now then let's go for it.

It's just too bad that we'll lose the performance gained from the
fact that GCC's CPP is linked into the C compiler binary and thus
doesn't have to transfer its result over a pipe or anything like
that.

I really think this whole idea isn't a very smart one. I would
rather have whoever would be working on a sparse backend instead
be working on kernel improvements.

I also think people underestimate how much work this would be.

2009-04-25 07:06:38

by Eric W. Biederman

Subject: Re: [rfc] built-in native compiler for Linux?

David Miller <[email protected]> writes:

> From: Christoph Lameter <[email protected]>
> Date: Fri, 24 Apr 2009 14:34:49 -0400 (EDT)
>
>> On Wed, 22 Apr 2009, Ingo Molnar wrote:
>>
>>> What i think makes sense is to build a _new_ precompiler / compiler
>>> / assembler / linker combo for Linux, from scratch, hosted in the
>>> kernel proper.
>>>
>>> A good technical basis for that would be Sparse, and it could start
>>> by acting as a drop-in replacement for CPP and it could feed its
>>> output to GCC with little changes. Sparse is small, has a very tidy
>>> code base and is already useful today as an extended static source
>>> code checker.
>>
>> A new preprocessor would be great. If we can make sparse do what CPP does
>> now then let's go for it.
>
> It's just too bad that we'll lose the performance gained from the
> fact that GCC's CPP is linked into the C compiler binary and thus
> doesn't have to transfer its result over a pipe or anything like
> that.
>
> I really think this whole idea isn't a very smart one. I would
> rather have whoever would be working on a sparse backend instead
> be working on kernel improvements.
>
> I also think people underestimate how much work this would be.

From what little I have seen of this conversation I would have
to agree. I have written my own C compiler in roughly the manner
proposed in the original email and I would be happy to discuss how
much work it really is if someone is interested.

If this was about teaching sparse to run lockdep at compile time, or
generally about making the kernel compilation much faster and able to
catch many more bugs there might be a point where the effort is worth
the investment.

Eric

by Christoph Lameter

Subject: Re: [rfc] built-in native compiler for Linux?

On Sat, 25 Apr 2009, Eric W. Biederman wrote:

> If this was about teaching sparse to run lockdep at compile time, or
> generally about making the kernel compilation much faster and able to
> catch many more bugs there might be a point where the effort is worth
> the investment.

The preprocessor and its interaction with regular C code is quite
nasty. If sparse could get rid of the complexities and idiosyncrasies at
that level then it may be useful as a "pre" compiler.

2009-04-27 17:34:39

by Al Viro

Subject: Re: [rfc] built-in native compiler for Linux?

On Mon, Apr 27, 2009 at 09:52:46AM -0400, Christoph Lameter wrote:
> On Sat, 25 Apr 2009, Eric W. Biederman wrote:
>
> > If this was about teaching sparse to run lockdep at compile time, or
> > generally about making the kernel compilation much faster and able to
> > catch many more bugs there might be a point where the effort is worth
> > the investment.
>
> The preprocessor and its interaction with regular C code is quite
> nasty. If sparse could get rid of the complexities and idiosyncrasies at
> that level then it may be useful as a "pre" compiler.

Explain, please. BTW, at the risk of being called an elitist bastard, could
I ask the participants of that thread to read C99 standard? It's not hard
to find (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf, for one
thing - that's C99 + errata) and at least chapter 5 and 6.10 are really
must-read if we are talking about that stuff.

In particular, C preprocessor does *NOT* work on text-to-text level and
hasn't since way back. It works on token stream.

That's actually one of the areas where C99 is a huge improvement over
earlier language - instead of more or less nasty kludges still trying
to pretend that preprocessor was a filter with text output piped into
compiler, it gives reasonably clear semantics approximating what the
earlier variants had in common.

I'd done fairly complete rewrite of macro expansion (and a bunch of other
places in preprocessor) in sparse, and places where it used to deviate from
standard were by far the worst in terms of convoluted logics and corner
cases. Switching to what C99 asked had simplified the things a *lot*;
it's surprisingly well thought through in that area (unlike e.g. the unholy
mess around 'restrict' qualifier semantics).

2009-04-27 18:41:52

by Eric W. Biederman

Subject: Re: [rfc] built-in native compiler for Linux?

Al Viro <[email protected]> writes:

> On Mon, Apr 27, 2009 at 09:52:46AM -0400, Christoph Lameter wrote:
>> On Sat, 25 Apr 2009, Eric W. Biederman wrote:
>>
>> > If this was about teaching sparse to run lockdep at compile time, or
>> > generally about making the kernel compilation much faster and able to
>> > catch many more bugs there might be a point where the effort is worth
>> > the investment.
>>
>> The preprocessor and its interaction with regular C code is quite
>> nasty. If sparse could get rid of the complexities and idiosyncrasies at
>> that level then it may be useful as a "pre" compiler.
>
> Explain, please. BTW, at the risk of being called an elitist bastard, could
> I ask the participants of that thread to read C99 standard? It's not hard
> to find (http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf, for one
> thing - that's C99 + errata) and at least chapter 5 and 6.10 are really
> must-read if we are talking about that stuff.
>
> In particular, C preprocessor does *NOT* work on text-to-text level and
> hasn't since way back. It works on token stream.
>
> That's actually one of the areas where C99 is a huge improvement over
> earlier language - instead of more or less nasty kludges still trying
> to pretend that preprocessor was a filter with text output piped into
> compiler, it gives reasonably clear semantics approximating what the
> earlier variants had in common.

Well it came in C89, not C99. Which could well be the reason the
specification is clear and reasonable.

> I'd done fairly complete rewrite of macro expansion (and a bunch of other
> places in preprocessor) in sparse, and places where it used to deviate from
> standard were by far the worst in terms of convoluted logics and corner
> cases. Switching to what C99 asked had simplified the things a *lot*;
> it's surprisingly well thought through in that area (unlike e.g. the unholy
> mess around 'restrict' qualifier semantics).

Thank you. I can't even imagine someone writing a C Preprocessor with
K&R semantics at this late date.

Eric