This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: gcc compile-time performance
- From: Jan Hubicka <jh at suse dot cz>
- To: Robert Dewar <dewar at gnat dot com>
- Cc: Richard dot Earnshaw at arm dot com, jh at suse dot cz, gcc at gcc dot gnu dot org
- Date: Mon, 20 May 2002 16:58:59 +0200
- Subject: Re: gcc compile-time performance
- References: <20020520145240.272E6F28CC@nile.gnat.com>
> <<Native floating point code is a problem, unfortunately, since on i386 you get
> different results in optimized and non-optimized builds, breaking bootstrap.
> Fixed point code is a problem, as we are interested in comparisons relative
> to the highest frequency in the program. This may be the entry block for
> a tree-structured function, but it may be the inner loop, and at high loop
> nests there is more than the 2^30 of difference between these two that we can
> afford in integral arithmetic.
> Why not just use 64-bit scaled integer arithmetic? It sounds like it would
> work fine in this case, and it will be a lot faster than emulated floating point.
Say that I am interested in relative results in the range 0-10000 (the current
setting). It is easy for a tree-like function to have values well below 1 that
sum back up into important results - see insn-attrtab, where the majority of
blocks in conditionals have frequency 0 (rounded), yet the then edge is 1000,
just because the conditionals are huge.
Similarly, if every loop is predicted to iterate 8 times, we will overflow at a
loop nest depth of 16, even with the current range (10000). Such functions
exist in practice, unfortunately. The loops do not iterate such an extreme
number of times, but we need to estimate that (or at least I don't see a good
scheme to avoid it). The loop nest issue will become more serious once
intraprocedural propagation takes place.
Another alternative I was thinking about is to keep all values normalized, so
if we are getting close to overflow, we simply shift all the computed
values by 2, but this scheme is not very sensitive to the local behaviour of