This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: gcc compile-time performance
- From: dewar at gnat dot com (Robert Dewar)
- To: dewar at gnat dot com, jss at ast dot cam dot ac dot uk, mark at codesourcery dot com
- Cc: gcc at gcc dot gnu dot org
- Date: Sat, 18 May 2002 15:59:40 -0400 (EDT)
- Subject: Re: gcc compile-time performance
<<This is a common misconception. The problem is that sometimes (often)
the headers are the *vast* majority of the code in a single translation
unit. (Often, more than 95%). In order to be conformant, you must
not only lex and parse all of that code -- you must perform significant [...]>>

Well of course that's true. It's true in Ada too. The point is that, to my
way of thinking, in a modern compiler the front-end processing should be
extremely fast, and most of the time should be spent in the back end.
Certainly in the case of GNAT at -O2, the majority of time is spent in
the back end for almost all programs. Precompiled headers (in Ada,
precompiled specs) will typically only be able to help the front-end
time. Furthermore, the gap between processor speed and disk speed keeps
widening, so as time goes on the relative expense of writing and reading
these precompiled headers grows. On a modern gigahertz superscalar machine
you can execute an awful lot of instructions in the time it takes to read
in a moderately sized file.
The following figures are from Ada; I don't know how C++ compares, but I
would be surprised if it is much different. Atree.ads/adb is about
300K of source code, corresponding to about 8000 lines of Ada with comments
(perhaps half that without comments). The tree file (which is what would
correspond to a precompiled header -- it contains a full symbol table and
a fully decorated semantic tree) is about 5 megabytes.
In the time it takes to read 5 megs of data, you can execute perhaps
1 billion instructions. That's quite a lot for 4000 source lines.
I don't know what the comparison of C++ and Ada is in this regard. Certainly
the tree file from Ada is nowhere near as compressed as it could be (it has
some moderate ad hoc compression, but if I zip it, the size comes down to
about 2 megs).
Perhaps this summer I will write a C++ equivalent of my fast Ada parser (that's
the one written in aggressive x86 assembler, not the one used in GNAT) that
runs something like 100,000,000 lines per minute on a modern machine, and see
how the two languages compare just for lexical analysis and parsing. Semantic
analysis is of course far harder to analyze, but it still seems uncomfortable
for it to be taking a long time compared to optimized code generation.
I do agree that if you don't have a really fast front end, then precompiled
headers can be a big win; I am just not convinced that they are necessarily
a win for a front end written for maximum speed.