C++: EGCS perf. vs GCC

John S. Dyson dyson@iquest.net
Mon Dec 7 11:41:00 GMT 1998


Jeffrey A Law said:
> 
>   > I understand there is a Pentium compiler around (pcc or something); will
>   > it make any difference? (Or is this the stuff on which the Pentium support
>   > in gcc/egcs is based?)
> We are slowly, but surely moving the major features from pgcc into egcs.  Will
> pgcc help your code?  Maybe, maybe not.
                        ^^^^^^^^^^^^^^^^
> 
> 
>   > And finally, is Pentium II = Pentium Pro+MMX+better L2 cache? This is why I
>   > hoped -mcpu=pentiumpro would unleash the demons living in my machine :-)
> My understanding is that from a scheduling standpoint PII == PPro.  So what
> you did makes sense.  The problem is the x86 port is just not designed to
> work with an instruction scheduler.  Thus performance when scheduling varies
> wildly.
> 
It seems that scheduling with the P6 is sometimes counterintuitive.  One
can often gain significant performance improvements, but those improvements
are very sensitive to alignment and "pipeline" state.  I suggest reviewing
carefully:

	http://www.announce.com/agner/assem

This is the most accurate and informative document that I have seen on X86
optimization.  There are older versions of that doc floating around, but the
1 Aug 98 version is fantastic.  It contains some info that I already knew, but
in a very clear and concise form -- and lots of info that I didn't already know.

I have played around with hand-optimizing X86 (specifically P6) code, and it
is a "trip" :-).  IMO, on the P6, minimize partial register stalls (they are
really ugly -- however, there are ways of repairing the damage), and be somewhat
careful about alignment.  It is also a good thing to unroll very short loops.
The normal kinds of scheduling don't seem to apply to the P6.  For almost any
floating-point code that I have played with, including circuit analysis and
transforms, I suggest not using -fomit-frame-pointer, and it is very important
to align doubles when possible.
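
To make the alignment and unrolling advice concrete, here is a minimal C++
sketch.  The names is_double_aligned and sum_unrolled are hypothetical and
only for illustration, and the unroll factor of four is an arbitrary
assumption, not a tuned value.  The first helper checks whether a pointer
falls on an 8-byte boundary; the second sums a short array with the loop
body unrolled by hand.  Whether either one helps on a given P6 has to be
measured.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if p sits on a sizeof(double) (8-byte) boundary.
bool is_double_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % sizeof(double) == 0;
}

// Hypothetical example: sum n doubles with the loop unrolled by four.
// Four independent accumulators keep the additions from forming one
// long dependency chain.
double sum_unrolled(const double* a, std::size_t n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    // Handle the 0..3 elements left over after the unrolled part.
    for (; i < n; ++i) {
        s0 += a[i];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Note that the unrolled version adds the elements in a different order than
a plain loop, so the result can differ in the last bits for general
floating-point input.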

When playing with GCC, PGCC, and EGCS, it appears that the "good" optimization
options for FP codes (on the X86) are not the same as for non-FP.  I guess that
is to be expected because of the necessarily different code generator for 
X86 FP.

-- 
John                  | Never try to teach a pig to sing,
dyson@iquest.net      | it makes one look stupid
jdyson@nc.com         | and it irritates the pig.


