This is the mail archive of the
mailing list for the GCC project.
Can we speed up the gcc_target structure?
- From: Ian Lance Taylor <ian at wasabisystems dot com>
- To: gcc at gcc dot gnu dot org
- Date: 18 Jan 2004 03:37:38 -0500
- Subject: Can we speed up the gcc_target structure?
Back in the old days, gcc had a lot of code which was conditionally
compiled with #ifdef. That was ugly, but the resulting code was fast.
Over time, a lot of the parameters checked with #ifdef were converted
into macros which were checked at runtime using if. That was less
ugly, and, since the macros normally had constant values, when gcc was
compiled with an optimizing compiler, the code was just as fast in the
normal case. When it was slower, it was generally because the
compiler was doing something it couldn't do before.
More recently, some of those parameters have moved into the gcc_target
structure. They are still checked at run time, but now the if
condition never has a constant value. It always requires fetching a
value from memory in the target vector, and often requires calling a
function. This results in cleaner, more comprehensible code.
However, it also slows the compiler down.
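To make the three stages concrete, here is a rough sketch. Every name in it (the DEMO_ macros, struct demo_target, the widen_ functions) is invented for illustration and does not correspond to real gcc declarations.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are gcc's
   real declarations.  */

/* Stage 1: conditional compilation.  The unused branch never reaches
   the compiler proper.  */
#define DEMO_PROMOTE_ARGS_P   /* pretend the target header defines this */

int
widen_stage1 (int bits)
{
#ifdef DEMO_PROMOTE_ARGS_P
  return bits < 32 ? 32 : bits;
#else
  return bits;
#endif
}

/* Stage 2: a macro with a constant value, tested with an ordinary if.
   Both branches are parsed and type-checked, and an optimizing
   compiler can fold the test away.  */
#define DEMO_PROMOTE_ARGS 1

int
widen_stage2 (int bits)
{
  if (DEMO_PROMOTE_ARGS)
    return bits < 32 ? 32 : bits;
  return bits;
}

/* Stage 3: a field in a target vector.  The test now needs a load
   from memory and an indirect call, and generally cannot be folded
   at compile time.  */
struct demo_target
{
  bool (*promote_args) (void);
};

static bool
demo_hook_true (void)
{
  return true;
}

struct demo_target demo_targetm = { demo_hook_true };

int
widen_stage3 (int bits)
{
  if (demo_targetm.promote_args ())
    return bits < 32 ? 32 : bits;
  return bits;
}
```

All three functions return the same values; the difference is only in how much work the test costs at run time.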
Just for fun, I converted every instance of targetm.xxxx to use a
TARGETM_XXXX macro instead. Then I added stuff like this to the end of
target.h:
#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) \
  targetm.calls.promote_function_args (FNTYPE)
Then I added stuff like this to i386.h:
#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) false
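Put together in one file, the experiment looks roughly like the following. This is only a sketch: the tiny target vector, the demo_ names and the #ifndef guard around the fallback are assumptions made so that the fragment compiles on its own, and the real hook takes a tree rather than a void pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of the target vector; the real struct
   gcc_target is far larger and its hooks take tree arguments.  */
struct demo_calls
{
  bool (*promote_function_args) (const void *fntype);
};

struct demo_target
{
  struct demo_calls calls;
};

/* Default hook: this demo target never promotes arguments.  */
bool
demo_hook_false (const void *fntype)
{
  (void) fntype;
  return false;
}

struct demo_target targetm = { { demo_hook_false } };

/* The line a target header such as i386.h would contribute.  */
#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) false

/* The fallback at the end of target.h.  The #ifndef guard is an
   assumption; it lets a target's constant definition take
   precedence over the target vector.  */
#ifndef TARGETM_CALLS_PROMOTE_FUNCTION_ARGS
#define TARGETM_CALLS_PROMOTE_FUNCTION_ARGS(FNTYPE) \
  targetm.calls.promote_function_args (FNTYPE)
#endif

/* Shared code tests only the macro.  With the constant definition
   the condition is a compile-time false and the branch can be
   deleted by the optimizer.  */
int
demo_arg_bits (const void *fntype, int bits)
{
  if (fntype != NULL
      && TARGETM_CALLS_PROMOTE_FUNCTION_ARGS (fntype)
      && bits < 32)
    return 32;
  return bits;
}
```

The constant agrees with what the hook in the vector returns, so the shared code behaves the same either way; only the cost of the test changes.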
Then I rebuilt the compiler and tried it on some reasonably small C++
example (with a native i386 GNU/Linux compiler). I saw compilation
speedups of up to 3% when compiling without optimization. The
resulting assembler output was, as expected, identical.
These tests were far from rigorous. However, compilation speed is a
concern these days, and this suggests that the target vector is a
measurable speed problem.
Somebody must have noticed this before, but I couldn't find anything
in the gcc mailing list.
It seems to me that we should try to find a way to regain the speed
which was lost when we switched to the target vector, without losing
the comprehensibility which was gained.
Here is a sketch of a possible approach which would require fairly
minimal changes in the way the target vector works today:
1) Turn hooks.c and targhooks.c into .h files which define inline
functions (with appropriate fallbacks to support older non-gcc
compilers for bootstrapping, of course).
2) Move all definitions of target initializer macros from tm.c files
into new CPU-target.h files.
3) Include CPU-target.h at the end of target-def.h, where it will
redefine and undefine target initializer macros. For cases in
which targetm is changed at run time, CPU-target.h must #undef the
corresponding initializer macro (and CPU.c must #define it before
initializing targetm) (alternatively, force targetm to be const,
and adjust the relatively few cases in which it is changed at run
time).
4) Change all uses of targetm.xxxx into code which uses TARGETM_XXXX
macros, as above.
5) Define the TARGETM macros as either using the target vector or
using the initializer macro from target-def.h. The choice would be
made based on whether the initializer macro was defined and
probably based on some other control.
6) Now code which uses the new inline versions of hooks.c and
targhooks.c, and which includes target-def.h and CPU-target.h, will
automatically use the inlined versions of the functions when
possible, and will see constant variable definitions when possible.
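The steps above might fit together as in the following single-file sketch. It is not a patch: the three flags, the demo_ functions and the TARGET_DEMO_/TARGETM_DEMO_ macros are all hypothetical, and the pieces that would live in separate headers and in CPU.c are simply laid out in order.

```c
#include <stdbool.h>

/* Step 1: default hooks as inline functions, standing in for a
   header version of hooks.c.  All names here are hypothetical.  */
static inline bool demo_hook_false (void) { return false; }
static inline bool demo_hook_true (void) { return true; }

struct demo_target
{
  bool (*flag_a) (void);
  bool (*flag_b) (void);
  bool (*flag_c) (void);
};

extern struct demo_target targetm;

/* Steps 2 and 3: initializer macros as they would stand after
   target-def.h has included a hypothetical CPU-target.h.
   TARGET_DEMO_FLAG_C is left undefined at this point because this
   target changes flag_c at run time.  */
#define TARGET_DEMO_FLAG_A demo_hook_false  /* default kept */
#define TARGET_DEMO_FLAG_B demo_hook_true   /* overridden by the CPU */

/* Step 5: each TARGETM macro uses the initializer when one is
   visible, and the target vector otherwise.  */
#ifdef TARGET_DEMO_FLAG_A
#define TARGETM_DEMO_FLAG_A() TARGET_DEMO_FLAG_A ()
#else
#define TARGETM_DEMO_FLAG_A() targetm.flag_a ()
#endif

#ifdef TARGET_DEMO_FLAG_B
#define TARGETM_DEMO_FLAG_B() TARGET_DEMO_FLAG_B ()
#else
#define TARGETM_DEMO_FLAG_B() targetm.flag_b ()
#endif

#ifdef TARGET_DEMO_FLAG_C
#define TARGETM_DEMO_FLAG_C() TARGET_DEMO_FLAG_C ()
#else
#define TARGETM_DEMO_FLAG_C() targetm.flag_c ()
#endif

/* What CPU.c would do: define the missing initializer just before
   initializing targetm.  */
#define TARGET_DEMO_FLAG_C demo_hook_false

struct demo_target targetm =
  { TARGET_DEMO_FLAG_A, TARGET_DEMO_FLAG_B, TARGET_DEMO_FLAG_C };

/* Steps 4 and 6: shared code uses only the macros.  Flags A and B
   become direct calls to inline functions; flag C still goes
   through the vector.  */
int
demo_count_enabled (void)
{
  return (TARGETM_DEMO_FLAG_A ()
          + TARGETM_DEMO_FLAG_B ()
          + TARGETM_DEMO_FLAG_C ());
}

/* A run-time change of the kind step 3 has to allow for.  */
void
demo_override_options (void)
{
  targetm.flag_c = demo_hook_true;
}
```

Calling demo_override_options changes what demo_count_enabled returns, which works only because flag C was kept on the target vector path.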
The main problem that I see with this approach is the requirement to
#undef an initializer macro which is changed at run time. That's why
I suggest the alternative of making targetm const.
We can convert to this approach over time if we require a particular
macro to be defined in order to define the TARGETM macros as using the
initializer macros rather than the target vector. Then a backend
which has been converted to use CPU-target.h would define that macro.
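The opt-in could be as simple as the sketch below. The macro name DEMO_TARGET_USES_INITIALIZERS is made up, as are the other demo_ names; DEMO_FLAG_IS_DIRECT exists only so that the chosen path can be observed.

```c
#include <stdbool.h>

/* Hypothetical names throughout.  */
static inline bool demo_hook_false (void) { return false; }

struct demo_target
{
  bool (*flag) (void);
};

#define TARGET_DEMO_FLAG demo_hook_false

struct demo_target targetm = { TARGET_DEMO_FLAG };

/* A backend converted to CPU-target.h would define this; an
   unconverted backend simply leaves it out.  */
#define DEMO_TARGET_USES_INITIALIZERS 1

/* Use the initializer only when the backend has opted in and an
   initializer is visible; otherwise go through the target vector.  */
#if defined (DEMO_TARGET_USES_INITIALIZERS) && defined (TARGET_DEMO_FLAG)
#define TARGETM_DEMO_FLAG() TARGET_DEMO_FLAG ()
#define DEMO_FLAG_IS_DIRECT 1
#else
#define TARGETM_DEMO_FLAG() targetm.flag ()
#define DEMO_FLAG_IS_DIRECT 0
#endif

bool
demo_flag_enabled (void)
{
  return TARGETM_DEMO_FLAG ();
}
```

Removing the DEMO_TARGET_USES_INITIALIZERS line sends every use back through targetm without touching the shared code.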
If we eventually want to configure gcc to support multiple target
vectors, that would still be possible. When more than one target
vector was to be supported, the code would force the TARGETM macros to
always use the target vector. This would be determined at configure
time.
I considered more complex approaches, such as creating a target.def
file which defined the target vector, but the basic problem boils down
to detecting when the target does not use the default version of a
target vector field. Inventing CPU-target.h seems as effective an
approach as any to solving this particular problem.
Any thoughts? Does anybody think this would be a waste of time? Does
anybody have a better approach to solving the general problem?