This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: A FrontEnd in C++?


On Monday 19 August 2002 08:55 am, Alexandre Oliva wrote:
> On Aug 19, 2002, Zack Weinberg <zack@codesourcery.com> wrote:
> > Based on the number of problems we have encountered with the Ada front
> > end compared to the others, I think that as a matter of project policy
> > writing front ends in any language other than C should be discouraged.
>
> I disagree.  The problems have all been caused by the front end
> requiring a pre-existing, compatible build of itself to build.  Using
> C++ to create a front end for any language other than C++ wouldn't
> bring in any new problems.

I have to agree with Alexandre.
And this has a lot of bearing on the recent hot thread: "Faster compilation 
speed".

Based on current Makefile usage, I'll refer to a 3-stage build as stage0, 
stage1, stage2 (stage0 is currently unlabeled).

The (intended) purpose of the stage0 phase is to build a "build compatible" 
compiler from whatever is at hand.
Historically: "any 'ol C compiler"
Recently, this has been expanded to "any 'ol C compiler" plus "any 'ol Ada 
compiler" (if they are link compatible or stage0 binutils can be built to 
make them so).

During my "pre-thinking" about revising the makefiles...
I don't see any reason (from the standpoint of the makefiles, that is) why 
this can't be generalized to: "any 'ol collection of compilers for any 'ol 
collection of languages"

How does this apply to speeding up the compiler? - Good question, I'm getting 
there...

Suppose the folks doing all of the work on optimizing code, optimizing 
register usage, etc., come to an agreement such as:
"Doing all of this in C is giving us:
1) poor spatial locality for the code, it needs to be better.
2) the spatial locality of the data really sucks
let us rewrite these parts in language xyz".

lisp? elisp? RTL_the_language? whatever... don't get focused on any one name 
for now...

Consider for now, any of the many languages which can be implemented as an 
"extensible, threaded, interpretive" language. (I am talking late 1970s 
technology here.)

Functionally, the implementation would:
1) translate source code -> byte code (as in "precompile elisp")
2) on receipt of a specific command code: 
    2a) traverse a function's byte code and inline all of the calls to its 
predecessor functions, yielding one bigger chunk of byte code.
    2b) do a few simple "peephole" optimizations.
    2c) rather than interpret this now bigger, optimized chunk of byte code, 
use the byte code interpreter's look-up table to find the corresponding 
native binary functions (all of which are written as PIC), strip each one's 
preamble and postamble, and copy the remaining instructions somewhere - call 
it an object code library if you like.
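
The steps above can be sketched in a few lines of Python.  This is only a 
toy under stated assumptions, not how any real TIL or GCC component works: 
the names PRIMITIVES, WORDS, inline, peephole and to_native are all made up 
for illustration, and plain strings stand in for the native instruction 
sequences that a real look-up table would point at.

```python
# Hypothetical toy model of steps 2a-2c; every name here is illustrative.

# The interpreter's look-up table: opcode -> "native" instruction sequence.
# Strings stand in for real position-independent machine code.
PRIMITIVES = {
    "DUP":  ["push top"],
    "DROP": ["pop"],
    "ADD":  ["pop b", "pop a", "push a+b"],
    "MUL":  ["pop b", "pop a", "push a*b"],
}

# User-defined words are byte code: lists of primitives or earlier words.
WORDS = {
    "SQUARE": ["DUP", "MUL"],
    "DOUBLE": ["DUP", "ADD"],
    "F":      ["SQUARE", "DUP", "DROP", "DOUBLE"],
}

def inline(name):
    """Step 2a: replace calls to predecessor words with their bodies."""
    out = []
    for op in WORDS[name]:
        if op in WORDS:
            out.extend(inline(op))   # expand the callee in place
        else:
            out.append(op)           # already a primitive
    return out

def peephole(code):
    """Step 2b: one simple rule, DUP immediately followed by DROP is a no-op."""
    out = []
    for op in code:
        if op == "DROP" and out and out[-1] == "DUP":
            out.pop()                # cancel the pair
        else:
            out.append(op)
    return out

def to_native(code):
    """Step 2c: look up each opcode and concatenate its instructions."""
    native = []
    for op in code:
        native.extend(PRIMITIVES[op])
    return native

flat = peephole(inline("F"))
library = to_native(flat)
```

Here inline("F") yields one flat run of primitives, peephole drops the 
useless DUP/DROP pair, and to_native produces a single contiguous 
instruction list with no calls left in it.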

You don't have to spend a lot of time making the above a real "killer 
language app" - its lifetime will be very short.

I haven't totally lost my mind here - I did several of these about twenty 
years ago - it isn't any bigger a challenge than you let it become.

What does that achieve?
Considering how the binary was generated, you have done spatial 
locality compression.  I.e., by coaxing the conversion process a little, you 
can force instruction chunks sized to fit I-cache line(s).

Now comes the speculation part, which I didn't do in those old TILs 
(threaded interpretive languages)...

Bytecode the data references the TIL language uses.
Follow the same translate, combine, simplify, convert sequence, converting to 
"native", "packed" data structures.  I.e., eliminate as much "pointer chasing" 
as possible.

So that does spatial locality compression for the data.  Here, again, 
with a little coaxing, build data structure chunks sized to fit D-cache 
line(s).
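
As a rough illustration of what "packed" means here, the sketch below 
flattens a pointer-linked tree into parallel arrays in which children are 
referenced by index rather than by pointer.  The Node class and the pack 
function are hypothetical names invented for this example, and Python gives 
no real control over cache lines; the point is only the layout change.

```python
from array import array

class Node:
    """Pointer-style tree node: each child is a separate heap object."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def pack(root):
    """Flatten a tree into three parallel arrays (preorder).

    Children become integer indices into the arrays; -1 means "no child".
    """
    values, lefts, rights = array("l"), array("l"), array("l")

    def visit(node):
        if node is None:
            return -1
        idx = len(values)
        values.append(node.value)
        lefts.append(-1)             # placeholders, filled in below
        rights.append(-1)
        lefts[idx] = visit(node.left)
        rights[idx] = visit(node.right)
        return idx

    visit(root)
    return values, lefts, rights

tree = Node(1, Node(2, Node(4)), Node(3))
values, lefts, rights = pack(tree)
```

Walking the packed form means stepping through contiguous storage instead 
of following one pointer per node, which is the effect being asked for.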

Now the bootstrap process becomes (adding a "partial" stage):

stage0a: 
Generate a "build compatible" approximation of GCC using "any 'ol C 
compiler" and the existing trees, in-liners, etc., plus the source (in C) of 
a "self compiling" approximation of the language in which the new trees, 
in-liners, and optimizers are written.

stage0b: 
Toss the existing sources for trees, in-liners, etc., substituting the sources 
(in the new, whatever, language) for the new replacements.  This generates a 
"fully compiled", GCC-C compatible, New Optimizer compatible "build 
compiler".

stage-next: Build GCC with the user specified options, features, and 
languages.

Continue per existing build process.
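
The stage ordering above can be written down as a small dependency check.  
This is a hypothetical model only: the STAGES table, the tool labels and the 
check_order function are invented for illustration and do not correspond to 
anything in the actual GCC makefiles.

```python
# Hypothetical model of the proposed stages; all names are illustrative.
# Each entry is (stage name, tools it needs, tools it produces).
STAGES = [
    ("stage0a",
     {"any C compiler"},
     {"approximate GCC", "new-language translator"}),
    ("stage0b",
     {"approximate GCC", "new-language translator"},
     {"build compiler"}),
    ("stage-next",
     {"build compiler"},
     {"GCC with user options"}),
]

def check_order(stages, available):
    """Return stage names in order, or raise if a stage lacks a needed tool."""
    have = set(available)
    done = []
    for name, needs, produces in stages:
        missing = needs - have
        if missing:
            raise ValueError(f"{name} is missing {sorted(missing)}")
        have |= produces             # later stages may use these outputs
        done.append(name)
    return done

order = check_order(STAGES, {"any C compiler"})
```

Starting from nothing but a C compiler, every later stage finds what it 
needs among the outputs of the stages before it.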

See - that bytecode thing didn't live very long (minutes, probably).

Such a process would allow even the "internals" of the GCC-C compiler to be 
built from multiple languages.  I think it may be the direction to take in 
speeding up the compiler.

Mike

