This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: g77 performance on ALPHA
- To: martin.kahlert@provi.de, egcs@egcs.cygnus.com
- Subject: Re: g77 performance on ALPHA
- From: N8TM@aol.com
- Date: Sat, 28 Aug 1999 11:40:24 EDT
In a message dated 8/28/99 3:28:13 AM PST, martin.kahlert@provi.de writes:
> So, how could g77 be improved most easily?
> Is it a problem with the Haifa-scheduler? It seems to me that gcc
> can't take that much advantage from unrolling like Compaq's compiler.
As your code shows, the gcc/g77 schedulers, whether the old one or Haifa,
don't do much for pipelined processors which lack out-of-order execution or
shadow register re-mapping. They do fairly well on several out-of-order
processors. For example, my g77 code sped up more with a change from the
R10000 to the R12000 processor than the MipsPro code did, and I'm generally
getting 80% of the performance from g77 that I get from MipsPro f90 with a
great deal of array syntax optimization.
The trend in some processor families has been to use out-of-order (OOO)
execution and similar features to enable fast execution of simple object
code. As the GNU compilers are required to perform well on such processors,
including those which have only a small set of registers directly available
to the program, I suppose it's unlikely that good performance will be
obtained on targets where specialized code generation for an in-order
pipeline is needed. The Pentium II processor runs quite well with the
-mpentiumpro option, which generates code compatible with processors back to
the 486 but does not skew scheduling toward the older models. This is
particularly true of code which doesn't spend much time in tight little
loops.
A trend which goes along with the development of processor families is that
the compilers are tuned for the model which has the largest customer base
paying for current support contracts and buying compilers. For example, the
current MipsPro compilers have lost performance on the last MIPS models which
lack OOO, even though the architecture switches are retained. g77 never got
even 40% of the speed of the chip vendor's compilers on those older models,
and I'm sure that the vendors weren't thinking of g77 when they developed
better hardware which depends less on code tuning.
If the resources were available, I would rather see effort spent on compiler
transformations such as outer unrolling than on targeting specific CPU models.
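
To make "outer unrolling" concrete, here is a minimal hand-written sketch in
C (not Fortran, and not the output of any compiler mentioned here). It takes
a matrix-vector product as an assumed example, unrolls the outer loop by 2,
and fuses the two inner loops, so each x[j] is loaded once for two rows and
the two sums are independent of each other. The function names and the
row-major layout are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: y = A*x for an n-by-m matrix stored row-major in a[]. */
void matvec_plain(size_t n, size_t m, const double *a,
                  const double *x, double *y)
{
    assert(a && x && y);
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < m; j++)
            sum += a[i * m + j] * x[j];
        y[i] = sum;
    }
}

/* Same product with the outer loop unrolled by 2 and the two inner
 * loops fused.  Each row is still summed in the original order, so
 * the results match the baseline exactly. */
void matvec_outer2(size_t n, size_t m, const double *a,
                   const double *x, double *y)
{
    assert(a && x && y);
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        const double *r0 = a + i * m;   /* row i     */
        const double *r1 = r0 + m;      /* row i + 1 */
        double s0 = 0.0, s1 = 0.0;      /* two independent accumulators */
        for (size_t j = 0; j < m; j++) {
            double xj = x[j];           /* one load shared by both rows */
            s0 += r0[j] * xj;
            s1 += r1[j] * xj;
        }
        y[i] = s0;
        y[i + 1] = s1;
    }
    if (i < n) {                        /* leftover row when n is odd */
        const double *r0 = a + i * m;
        double s0 = 0.0;
        for (size_t j = 0; j < m; j++)
            s0 += r0[j] * x[j];
        y[i] = s0;
    }
}
```

The independent accumulators give a static scheduler two dependency chains
to interleave, which is the kind of help an in-order pipeline needs. Whether
it pays off on a given chip still has to be measured.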
Tim
tprince@computer.org