This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[RFC] Let's kill specs, completely rewrite gcc.c
- To: gcc at gcc dot gnu dot org
- Subject: [RFC] Let's kill specs, completely rewrite gcc.c
- From: Neil Booth <neilb at earthling dot net>
- Date: Sun, 7 Jan 2001 13:14:42 +0000
- Cc: "Chris G . Demetriou" <cgd at sibyte dot com>
Over the last 3 months, changes to cpplib have required updating the
SPECS handling in gcc.c. For example, to fix the meaning of '|', and
to preserve order of -D and -U on the command line. I think yet
another spec may be required, similar to what Nathan posted last week,
for me to handle -MD and -MMD correctly - path preservation but suffix
replacement.
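To illustrate what "path preservation but suffix replacement" means for -MD and -MMD, here is a minimal sketch in C. The function name replace_suffix is hypothetical and is not taken from gcc.c or cpplib, and the sketch assumes '/' is the only directory separator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of gcc.c: return a newly allocated copy
   of PATH with the suffix of its final component replaced by SUFFIX.
   The directory part is preserved.  If the final component has no '.',
   SUFFIX is simply appended.  Returns NULL if allocation fails.
   Assumes '/' is the only directory separator.  */
char *replace_suffix(const char *path, const char *suffix)
{
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;
    /* Only look for a '.' in the final component, so that a dot in a
       directory name is never mistaken for the suffix.  */
    const char *dot = strrchr(base, '.');
    size_t stem_len = dot ? (size_t) (dot - path) : strlen(path);
    char *out = malloc(stem_len + strlen(suffix) + 1);

    if (out == NULL)
        return NULL;
    memcpy(out, path, stem_len);
    strcpy(out + stem_len, suffix);
    return out;
}
```

With this sketch, replace_suffix("src/foo.c", ".d") gives "src/foo.d", and the caller frees the result.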
A few mails last November with Chris Demetriou inspired me to think of
a better way of doing things. It strikes me that most patches to gcc.c
are simply kludges on top of an already gross kludge, and that the
spec-parsing part of gcc.c is hard to follow. However, I don't know
the historical reasons for specs being the way they are, so I may be
missing something.
Other reasons to replace specs:-
a) They are grossly inefficient. For example, the common SPEC for CPP
contains 50 sub-specs now. Each of these sub-specs requires a
complete scan of every command-line option. This amounts to an
enormous number of strcmps, when you have an average of 15 to 20
command line options. And that's just the common spec for CPP - it
doesn't include the additional target-specific CPP specs, or the specs
of the other "compilers", or that every command line option goes
through another 20 or so strcmps for gcc.c special things like
"-print-file-name", "-ftarget-help", and maybe yet more for option
remapping etc.
b) They are inflexible - most processing needs to be expressed in
terms of specs formulae, or kludged some other way. I'm thinking of
things like pipe handling, preserving -U and -D ordering, "GNU C does
not support -C without using -E" error messages etc. here. Another
good example here is that cpplib accepts a whole host of options that
tradcpp doesn't, e.g. -ftabstop=, but we have no way of telling "gcc"
to pass them on to cpplib but not to tradcpp that isn't ridiculously
convoluted. If you need one more, we pass the -W options on to CPP as
{W*}. In order for this to work, CPP has to be coded to silently
accept -W options that it doesn't understand. It would be nice if CPP
could give an error instead.
c) Chris said they make target or processor-specific configuration in
the config/* files awkward in some cases. I'm no expert, so I'll take
his word for that :-)
I would like to suggest a different approach, along the following
rough lines:
1) gcc.c contains a sorted table of all command-line options that the
compilers it drives understand. This might need to be provided by a separate
".tab" file generated at compile time and #included into gcc.c, for
reasons of flexibility over front-ends.
2) each switch in the table is flagged for such things as which front-ends
understand it, whether it takes an appended argument, a separate
argument, or both, etc.
3) for each command line switch, gcc.c uses this information to do a binary
search in the table for that switch. It extracts any argument as
appropriate. [code to do this with slightly less generality
already exists in cppinit.c]
4) Each possible switch has a usage count. During step 3), this is
incremented as the command line is scanned. Also, each command-line
argument is flagged with which front-ends understand it, and whether
it is a switch or argument, etc.
5) gcc.c works out which compiler chain is needed to perform the compilation,
much as it does now from file extensions. Virtualization through
hooks might be needed here.
6) With 4) and 5), any redundant or unused switches are easily scanned
for in a single pass and complained about.
7) Each compiler has a hook, and it is passed the flagged command-line
argument list from 4). This, and the usage counts in 4), makes it
easy to cleanly do the extra processing of the kind currently handled
by specs like
%{ffast-math:-D__FAST_MATH__}
%{MMD:-MM -MF %b.d}
At the same time, the compiler-specific hook extracts all the
switches flagged for use by its compiler, builds the relevant
command line, and invokes its compiler.
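Steps 1) to 4) and the first spec in 7) could be sketched in C roughly as follows. This is only an illustration under stated assumptions: every name here (option_entry, FE_CPP, ARG_JOINED, scan_switches and so on) is hypothetical, the table holds just a handful of switches, and the prefix comparison assumes no table entry is a prefix of another. It is not the code in cppinit.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical flags: which front ends understand a switch.  */
enum { FE_CPP = 1 << 0, FE_CC1 = 1 << 1 };
/* Hypothetical flags: how a switch takes its argument, if any.  */
enum { ARG_NONE = 0, ARG_JOINED = 1 << 0, ARG_SEPARATE = 1 << 1 };

struct option_entry {
    const char *name;       /* switch text without the leading '-' */
    unsigned front_ends;    /* FE_* mask */
    unsigned arg_style;     /* ARG_* mask */
    unsigned usage_count;   /* bumped while scanning the command line */
};

/* Illustrative table only.  Must stay sorted in strcmp order.  */
static struct option_entry option_table[] = {
    { "D",          FE_CPP,          ARG_JOINED | ARG_SEPARATE, 0 },
    { "MMD",        FE_CPP,          ARG_NONE,                  0 },
    { "U",          FE_CPP,          ARG_JOINED | ARG_SEPARATE, 0 },
    { "ffast-math", FE_CPP | FE_CC1, ARG_NONE,                  0 },
    { "ftabstop=",  FE_CPP,          ARG_JOINED,                0 },
};

/* bsearch comparator.  Switches that take a joined argument match on
   their name as a prefix; all others must match exactly.  */
static int compare_switch(const void *key, const void *elem)
{
    const char *sw = key;
    const struct option_entry *e = elem;

    if (e->arg_style & ARG_JOINED)
        return strncmp(sw, e->name, strlen(e->name));
    return strcmp(sw, e->name);
}

/* Binary search for the switch text SW (no leading '-').  */
static struct option_entry *find_option(const char *sw)
{
    return bsearch(sw, option_table,
                   sizeof option_table / sizeof option_table[0],
                   sizeof option_table[0], compare_switch);
}

/* Scan the arguments once, bumping usage counts.  Returns the number
   of switches that were not found in the table.  */
static int scan_switches(int argc, const char *const *argv)
{
    int unknown = 0;

    for (int i = 0; i < argc; i++) {
        if (argv[i][0] != '-' || argv[i][1] == '\0')
            continue;               /* an input file, not a switch */
        struct option_entry *e = find_option(argv[i] + 1);
        if (e == NULL) {
            unknown++;
            continue;
        }
        e->usage_count++;
        /* "-D FOO" form: the argument is the next word, so skip it.  */
        if ((e->arg_style & ARG_SEPARATE)
            && argv[i][1 + strlen(e->name)] == '\0' && i + 1 < argc)
            i++;
    }
    return unknown;
}

/* Part of a hypothetical CPP hook: the usage count replaces the spec
   %{ffast-math:-D__FAST_MATH__}.  Returns nonzero if the define
   should be added to CPP's command line.  */
static int cpp_wants_fast_math_define(void)
{
    const struct option_entry *e = find_option("ffast-math");
    return e != NULL && e->usage_count > 0;
}
```

Each switch costs one binary search over the table instead of one scan per sub-spec, and a hook can test a usage count rather than re-scanning the command line.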
I believe the above scheme (where a few details are omitted)
encapsulates the full functionality provided by specs at present, and
with the hooks provides more useful flexibility. I would be very
disappointed if it didn't give an order of magnitude speed-up, cut the
size of gcc.c at least in half, and make gcc.c more comprehensible.
I'm volunteering to do this. I'm interested in whether others agree
this is a good and workable plan, things I've missed, difficulties I
might encounter, any reasons why this wasn't done originally, etc.
Particularly when it comes to working across the differing targets.
I realise this would cause a big shakeup of files under the config/
directory at the same time, and is not a GCC 3.0 thing.
Thanks!
Neil.