This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: gcc compile-time performance

<<This is a common misconception.  The problem is that sometimes (often)
the headers are the *vast* majority of the code in a single translation
unit.  (Often, more than 95%).  In order to be conformant, you must
not only lex and parse all of that code -- you must perform significant
semantic analysis.

Well of course that's true. It's true in Ada too. The point is that in
a modern compiler, to my way of thinking, front end processing should be
extremely fast, and most of the time should be spent in the back end.
Certainly in the case of GNAT at -O2, the majority of time is spent in
the back end for almost all programs. Precompiled headers (in Ada,
precompiled specs) will typically only be able to help the front end
time. Furthermore, the ratio of processor speed to disk speed keeps
increasing, so as time goes on the relative expense of writing and reading
these precompiled headers becomes higher. On a modern gigahertz superscalar
machine you can execute an awful lot of instructions in the time it takes
to read in a moderately sized file.

The following figures are from Ada; I don't know how C++ compares, but I
would be surprised if it is that much different. One example unit is about
300K of source code, corresponding to about 8000 lines of Ada with comments
(perhaps half that without comments). The tree file (which is what would
correspond to a precompiled header -- it contains a full symbol table and
a fully decorated semantic tree) is about 5 megabytes.

In the time it takes to read 5 megs of data, you can execute perhaps
1 billion instructions. That's quite a lot for 4000 source lines.
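The arithmetic behind that claim can be sketched as follows; the disk throughput and instructions-per-cycle figures are assumed round numbers, not from the post itself:

```python
# Back-of-envelope check of the "5 MB vs. ~1 billion instructions" claim.
# Assumed figures (not from the original post): a ~1 GHz superscalar CPU
# retiring ~2 instructions per cycle, and a disk delivering ~10 MB/s.
file_size_mb = 5
disk_mb_per_sec = 10            # assumed effective read throughput
cpu_hz = 1_000_000_000          # 1 GHz
instr_per_cycle = 2             # assumed superscalar retirement rate

read_time_sec = file_size_mb / disk_mb_per_sec
instructions = read_time_sec * cpu_hz * instr_per_cycle
print(f"Reading {file_size_mb} MB takes ~{read_time_sec:.1f} s,")
print(f"time enough for ~{instructions:.0e} instructions.")
# → about 0.5 s, i.e. roughly 1e9 instructions
```

With different assumed disk and CPU figures the result moves by a small constant factor, but the order of magnitude stands.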

I don't know what the comparison of C++ and Ada is in this regard. Certainly
the tree file from Ada is nowhere near as compressed as it could be (it has
some moderate ad hoc compression, but if I zip it, the size comes down to
about 2 megs).

Perhaps this summer I will write a C++ equivalent of my fast Ada parser (the
one written in aggressive x86 assembler, not the one used in GNAT), which
runs something like 100,000,000 lines per minute on a modern machine, and see
how the two languages compare just for lexical analysis and parsing. Semantic
analysis is of course far harder to analyze, but it still seems uncomfortable
for it to take a long time compared to optimized code generation.
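To put that parsing rate next to the tree-file figures above, a sketch of the arithmetic (the parser speed is the figure quoted above; the 10 MB/s disk throughput is an assumed round number):

```python
# Compare re-parsing the 8000-line example unit from scratch against
# reading its 5 MB precompiled tree file from disk.
lines = 8000                      # the example unit, with comments
parse_lines_per_min = 100_000_000 # fast-parser rate quoted above
tree_file_mb = 5
disk_mb_per_sec = 10              # assumed read throughput

parse_time_sec = lines / (parse_lines_per_min / 60)
tree_read_sec = tree_file_mb / disk_mb_per_sec
print(f"re-parse: ~{parse_time_sec * 1000:.1f} ms")
print(f"tree-file read: ~{tree_read_sec * 1000:.0f} ms")
```

Under these assumptions re-parsing is two orders of magnitude cheaper than reading the tree file, which is the point of the argument: a sufficiently fast front end makes the precompiled form the slow path.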

I do agree that if you don't have a really fast front end, then precompiled
headers can be a big win; I am just not convinced that they are necessarily
a win for a front end written for maximum speed.
