This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Pre-compiled headers
- To: Zack Weinberg <zack at wolery dot cumb dot org>
- Subject: Re: Pre-compiled headers
- From: Per Bothner <per at bothner dot com>
- Date: 12 Jan 2000 11:06:28 -0800
- Cc: gcc at gcc dot gnu dot org
- References: <20000111201227.A18872@wolery.cumb.org>
Zack Weinberg <zack@wolery.cumb.org> writes:
> So, Cygnus has contracted me to implement precompiled headers.
I think we should definitely finish cpplib and make it the default,
and then fix the C/Obj-C/C++ lexers to only tokenize *once* (i.e. have
cpplib do the tokenization, and then have the lexer use the resulting
tokens). Related to that: Perhaps macro names should be stored in the
same symbol table as identifiers, so we don't have to look up each
identifier twice.
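
A minimal sketch of the shared-table idea, assuming a toy chained hash table.
None of these names (struct symbol, intern, macro_body) come from cpplib or
the GCC front ends; they are hypothetical and only illustrate how one lookup
can answer both "is this a macro?" and "what does this identifier denote?".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 64               /* arbitrary size for the sketch */

/* Hypothetical combined entry: one node per spelling, shared by the
   preprocessor (macro_body) and the language front end (decl). */
struct symbol {
    const char *name;               /* not copied; caller keeps it alive */
    const char *macro_body;         /* non-NULL while the name is a macro */
    void *decl;                     /* language-level binding, if any */
    struct symbol *next;            /* hash chain */
};

static struct symbol *buckets[TABLE_SIZE];
static struct symbol pool[TABLE_SIZE];
static size_t pool_used;

static unsigned hash(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Find or create the entry for NAME. Returns NULL if the pool is full. */
struct symbol *intern(const char *name)
{
    unsigned h;
    struct symbol *s;

    assert(name != NULL);
    h = hash(name);
    for (s = buckets[h]; s != NULL; s = s->next)
        if (strcmp(s->name, name) == 0)
            return s;
    if (pool_used == TABLE_SIZE)
        return NULL;
    s = &pool[pool_used++];
    s->name = name;
    s->macro_body = NULL;
    s->decl = NULL;
    s->next = buckets[h];
    buckets[h] = s;
    return s;
}
```

With this layout the lexer would call intern() once per identifier token and
then check macro_body before falling through to decl, instead of hashing the
same spelling in two separate tables.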
But beyond that, perhaps pre-compiled headers are not the best
solution. What we want to do is speed up compilation, specifically
speed up compilation of large projects with many header files and
many program files. Instead of speeding up compilation using
pre-compiled header files, what about *amortizing* header file
handling over many *program* files?
Right now,
gcc -c foo1.c foo2.c foo3.c
results in:
cc1 foo1.c && as foo1.s
cc1 foo2.c && as foo2.s
cc1 foo3.c && as foo3.s
Could we instead do:
cc1 foo1.c foo2.c foo3.c
as foo1.s
as foo2.s
as foo3.s
That is, change cc1 (cc1plus) so it can process multiple source
files in one invocation. If each of fooX.c does
#include "foo1.h"
#include "foo2.h"
#include "foo3.h"
#include "bar1.h"
#include "bar2.h"
then when the compiler gets to foo2.c, it can remember that it has
already seen and processed foo1.h etc.
One tricky part is that the compiler must re-initialize its state
when it is done compiling foo1.c and ready for foo2.c. However, it
must still keep around the tree nodes produced by header files it
has seen. To do that, it keeps the actual tree nodes around,
but it removes the binding between the IDENTIFIER_NODEs and the
declared macros/variables/classes/etc. For each header file it has
seen it remembers the names it defines, so then when it sees the
same header file included by the next compilation unit, it can
re-establish the bindings.
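
The unbind/rebind step can be sketched as follows. This is only an
illustration under stated assumptions: struct decl stands in for a tree node,
struct ident for an IDENTIFIER_NODE, and header_record, header_define,
header_unbind and header_rebind are hypothetical names, not existing GCC
functions. A fixed-size array is used to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NAMES 8                 /* arbitrary limit for the sketch */

struct decl {                       /* stand-in for a tree node */
    const char *what;
};

struct ident {                      /* stand-in for an IDENTIFIER_NODE */
    const char *name;
    struct decl *binding;           /* NULL while the name is unbound */
};

/* Per-header record: which names the header defines, and the nodes
   that were built for them and kept alive across compilation units. */
struct header_record {
    const char *path;
    size_t count;
    struct ident *idents[MAX_NAMES];
    struct decl *decls[MAX_NAMES];
};

/* While first processing HDR: bind ID to D and remember the pair.
   Returns 0 on success, -1 if the record is full. */
int header_define(struct header_record *hdr, struct ident *id,
                  struct decl *d)
{
    assert(hdr != NULL && id != NULL && d != NULL);
    if (hdr->count >= MAX_NAMES)
        return -1;
    hdr->idents[hdr->count] = id;
    hdr->decls[hdr->count] = d;
    hdr->count++;
    id->binding = d;
    return 0;
}

/* End of a compilation unit: drop the bindings but keep the nodes. */
void header_unbind(struct header_record *hdr)
{
    size_t i;
    for (i = 0; i < hdr->count; i++)
        hdr->idents[i]->binding = NULL;
}

/* The next unit includes the same header: restore the bindings
   without re-reading or re-parsing the file. */
void header_rebind(struct header_record *hdr)
{
    size_t i;
    for (i = 0; i < hdr->count; i++)
        hdr->idents[i]->binding = hdr->decls[i];
}
```

The sketch ignores the harder cases, such as a header whose meaning depends
on macros defined before it is included; a real implementation would have to
detect those and fall back to re-processing the header.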
Advantages:
(1) No setup (pre-compilation pass) needed.
(2) Simple to use. Source files don't have to be changed.
Makefiles or build scripts may need to be changed to take
advantage of this.
(3) Likely substantially better speedup than other approaches.
(Note this approach does not preclude pre-tokenizing header
files in addition.)
(4) Some changes of the compiler needed, but they seem like
desirable cleanups. (Some of the changes may be similar to what
we would want to turn the backend into a library.)
--
--Per Bothner
per@bothner.com http://www.bothner.com/~per/