Speed up genattrtab

Jan Hubicka hubicka@ucw.cz
Thu Jun 17 11:50:00 GMT 2010


> On Wed, Jun 16, 2010 at 02:22:58PM -0700, Mark Mitchell wrote:
> > Jakub Jelinek wrote:
> > 
> > > 2010-06-16  Jakub Jelinek  <jakub@redhat.com>
> > > 
> > > 	* Makefile.in (cfgexpand.o): Depend on $(INSN_ATTR_H).
> > > 	* genattrtab.c (check_tune_attr, find_tune_attr): New functions.
> > 
> > This should have no impact on compile-time for things compiled with GCC,
> > correct?  If so, for avoidance of doubt, while I haven't reviewed the
> > patch in detail, I certainly have no objections to it.  Let me know if
> 
> It has compile time impact, but a mixed one.  The negative performance
> impact is that internal_dfa_insn_code and insn_default_latency calls
> are no longer direct function calls (on the targets which have some cpu/tune
> attribute tested in all reservations), but are function pointers and thus
> indirect calls.  The pointer isn't changing much usually though (unless
> optimize/target attribute/pragma is used, it shouldn't change at all).
> How much this costs depends on the host CPU (and whether the host CPU
> is able to cache target CPU if the fn pointer isn't changing;
> currently the init_sched_attrs call which is called once per function
> always writes the fn pointers, usually with the same value as it already has
> - would it help for some host CPUs if the function instead computed the
> fn pointers into temporary variable and wrote the fn pointer var only
> if the temporary is different from its current contents?).

I think in general we have gone the way of function pointers rather than
macro machinery with direct calls, even in hot parts of the program (we
have target hooks in general_operand and friends; the dataflow branch
has indirect calls in its inner loop, etc.).

So I would not worry about this particular case; it is no worse than
existing practice.
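
For reference, the "compute into a temporary and store only if it differs"
idea from the quoted paragraph could look roughly like the sketch below.
This is only an illustration under assumptions: every name in it
(enum tune, latency_tune_a/b, default_latency_fn, pointer_writes and the
*_sketch functions) is a hypothetical stand-in, not what genattrtab
actually emits, and the latency values are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tunings; stand-ins for the values of a cpu/tune attribute.  */
enum tune { TUNE_A, TUNE_B };

/* Hypothetical per-tuning functions; the returned latencies are made up.  */
int latency_tune_a (int insn_code) { return insn_code + 1; }
int latency_tune_b (int insn_code) { return insn_code * 2; }

/* Function pointer that callers go through (the indirect call).  */
int (*default_latency_fn) (int) = latency_tune_a;

/* Counts real stores to the pointer, only to make the effect visible.  */
unsigned pointer_writes;

/* Meant to be called once per compiled function.  The new value is
   computed into a temporary and stored only when it differs, so in the
   common case the pointer variable is read but never written.  */
void
init_sched_attrs_sketch (enum tune t)
{
  int (*tmp) (int) = (t == TUNE_B) ? latency_tune_b : latency_tune_a;

  if (tmp != default_latency_fn)
    {
      default_latency_fn = tmp;
      pointer_writes++;
    }
}

/* Dispatcher: one indirect call through the current pointer.  */
int
default_latency_sketch (int insn_code)
{
  assert (default_latency_fn != NULL);
  return default_latency_fn (insn_code);
}
```

Calling init_sched_attrs_sketch (TUNE_A) repeatedly leaves pointer_writes
at 0; only an actual change of tuning stores to the pointer.  Whether
avoiding the redundant store helps at all is host-CPU dependent and would
need measuring.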
> 
> The advantage is that the text size of the functions shrinks a lot
> (at least on the architectures I've looked at - i?86/x86_64, powerpc{,64}
> and s390{,x} the .text size of all the per tuning functions together
> is smaller than the .text size of the old monster functions, the sum of
> all the per tuning function .rodata sizes (jump tables) usually slightly
> grew, but still for each individual function both sizes are much smaller),
> which means that unless optimize/target attribute is used heavily and every
> function uses different tuning, the new code is much more i-cache and
> d-cache friendly.  Plus, many extract_insn_cached or
> extract_constrain_insn_cached calls could go away - if say only one tuning
> was interested in that additional info and all others don't care for
> some particular insn, the new code will call it only in the function
> for the tuning that needs it and not in the other tuning functions.

Oprofiling the compilation of small files, even with an LTO-linked binary,
we do have a lot of system overhead (it is over 70% for an empty-file
compilation).

I guess the cost of mmapping large binaries plus the memory dirtied at
startup accounts for a lot of it here.

Honza
> 
> The last arch I've looked at was arm - there the patch doesn't make any
> difference (except for #define init_sched_attrs() do { } while (0) in
> insn-attr.h) because arm currently doesn't have a single const attribute
> that is used in tests for all reservations.  The solution there could be
> to create a new const attribute that would combine the attributes currently
> used in define_insn_reservation tests (say a bitfield containing the other
> attrs), will leave that to arm maintainers if they wish to do that.
> 
> 	Jakub