This is the mail archive of the
mailing list for the GCC project.
Re: Can we speed up the gcc_target structure?
Ian Lance Taylor <email@example.com> writes:
> "Zack Weinberg" <firstname.lastname@example.org> writes:
>> Furthermore, while a 3% measured speed hit is a concern, I think that
>> trying to win it back by undoing the targetm transformation - in the
>> object files, if not in the source code - is barking up the wrong
>> tree. Instead we should be looking for ways to avoid having targetm
>> hooks in critical paths in the first place. It's been my experience
>> that that is a much more fruitful source of optimization.
> I don't have anything against that goal, but it is in conflict with
> the goal of speeding up the compiler. Simply moving targetm hooks
> obviously can not get you the full speedup. The full speedup comes
> when an optimizing compiler compiling gcc can see that certain values
> are constants, such as, in my example, the various promote_* functions
> in the target vector. You can't pull those target hooks out of the
> critical path. Function calls are on the critical path for a
> non-optimizing compilation of many types of C++ code, and a
> non-optimizing compilation is the case where compilation speed is the
> most important.
This - and further discussion downthread - misses the point I was
trying to make.
You're seeing a 3% speedup on some test case by exposing that certain
elements of targetm.calls are compile-time constant. Here are the
existing elements of that structure:

    bool (*promote_function_args) (tree fntype);
    bool (*promote_function_return) (tree fntype);
    bool (*promote_prototypes) (tree fntype);
    rtx (*struct_value_rtx) (tree fndecl, int incoming);
    bool (*return_in_memory) (tree type, tree fndecl);
    bool (*return_in_msb) (tree type);
    rtx (*expand_builtin_saveregs) (void);
    /* Returns pretend_argument_size.  */
    void (*setup_incoming_varargs) (CUMULATIVE_ARGS *ca, enum machine_mode mode,
                                    tree type, int *pretend_arg_size,
                                    int second_time);
    bool (*strict_argument_naming) (CUMULATIVE_ARGS *ca);
    /* Returns true if we should use SETUP_INCOMING_VARARGS and/or
       targetm.calls.strict_argument_naming().  */
    bool (*pretend_outgoing_varargs_named) (CUMULATIVE_ARGS *ca);

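As an illustration of the trade-off under discussion (this is a sketch,
not code from the GCC sources), the fragment below contrasts a hook
reached through a function pointer with a value exposed as a
compile-time constant. The names calls_hooks, demo_targetm_calls,
always_true, promote_via_hook, promote_via_macro and
DEMO_PROMOTE_PROTOTYPES are all hypothetical, and the tree typedef is
only a stand-in for GCC's own type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's tree type; only its identity matters.  */
typedef struct tree_node *tree;

/* A cut-down target vector in the style of targetm.calls (assumed shape).  */
struct calls_hooks {
  bool (*promote_prototypes) (tree fntype);
};

/* One target's implementation: the answer never depends on the argument.  */
static bool
always_true (tree fntype)
{
  (void) fntype;
  return true;
}

/* Not const and externally visible, so a compiler building only this
   file generally has to assume the pointer may be changed elsewhere.  */
struct calls_hooks demo_targetm_calls = { always_true };

/* Hook style: every query is an indirect call through the vector.  */
bool
promote_via_hook (const struct calls_hooks *hooks, tree fntype)
{
  assert (hooks != NULL && hooks->promote_prototypes != NULL);
  return hooks->promote_prototypes (fntype);
}

/* Macro style: the value is visible at compile time, so a branch on it
   in a caller can be folded away by an optimizing compiler.  */
#define DEMO_PROMOTE_PROTOTYPES 1

bool
promote_via_macro (tree fntype)
{
  (void) fntype;
  return DEMO_PROMOTE_PROTOTYPES;
}
```

Both functions return the same answer for this toy target. The
difference is only in what the compiler building the caller can see:
a constant in one case, an opaque pointer in the other.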
Furthermore, skimming tm.texi, it looks like there are at least a
hundred more function-call-related target macros that haven't yet been
moved into the target vector. FUNCTION_ARG is an obvious example, but
there are also things like SPLIT_COMPLEX_ARGS, PUSH_ARGS_REVERSED,
FRAME_GROWS_DOWNWARD, ...
Of course inefficiencies are going to be introduced if we just convert
each macro to a target hook with the same semantics. But that isn't
the only option on the table. The right thing is to redesign this
interface so that it doesn't *need* 100+ macros and toggles. If this
is done properly, then not only should there be no inefficiency
introduced by going through the target vector, but we would also have
something that is straightforward to maintain and straightforward to
extend to new architectures.
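One possible shape for such a redesign, offered purely as an assumption
and not as anything proposed in this thread, is a single coarse-grained
hook that fills in a plain data descriptor once, so that hot paths read
ordinary fields instead of making an indirect call per query. Every
name below (call_conv_info, demo_target, describe_call_conv,
demo_hook_calls, init_call_conv, count_promoted_args) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: plain data the middle end reads directly.  */
struct call_conv_info {
  bool promote_args;
  bool promote_return;
  bool push_args_reversed;
};

/* Hypothetical target vector with one coarse-grained hook.  */
struct demo_target {
  void (*describe_call_conv) (struct call_conv_info *out);
};

/* Counts indirect calls, so a caller can check how often the hook ran.  */
unsigned demo_hook_calls = 0;

/* One target's description of its calling convention (made-up values).  */
static void
demo_describe (struct call_conv_info *out)
{
  demo_hook_calls++;
  out->promote_args = true;
  out->promote_return = true;
  out->push_args_reversed = false;
}

const struct demo_target demo_target_vector = { demo_describe };

/* Consult the target once, at initialization time.  */
void
init_call_conv (const struct demo_target *target, struct call_conv_info *info)
{
  assert (target != NULL && info != NULL);
  target->describe_call_conv (info);
}

/* Stand-in for a hot path: it reads a cached field for each argument
   and never goes back through the target vector.  */
unsigned
count_promoted_args (const struct call_conv_info *info, unsigned nargs)
{
  unsigned promoted = 0;
  for (unsigned i = 0; i < nargs; i++)
    if (info->promote_args)
      promoted++;
  return promoted;
}
```

After one call to init_call_conv, any number of calls to
count_promoted_args leave demo_hook_calls at 1. Whether a descriptor
like this could cover what the existing macros express is exactly the
design question raised above; the sketch only shows that the hook cost
can be moved off the per-call path.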
As a data point, I am aware of another (proprietary) compiler that
completely isolates the back end from the optimizers, to the point
where the back end module can be swapped out at runtime, and it
benchmarks competitively with or faster than GCC on similar input. So I
don't believe that this is impossible.