This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Optimization: Conclusions from Evolutionary Analysis
Jan Hubicka wrote:
I would like to see -ftracer enabled for -O3; in fact, I am quite
surprised it is not. We have done that for a while in SuSE compilers
and it works fine.
It is not a good -O2 candidate, as it increases code size too much. It
may, however, be nice to enable it at -O2 only when
-fbranch-probabilities is present.
Everything I've seen suggests that -ftracer would be an effective
addition to -O3. -freduce-all-givs seems valuable in specific situations
on the P3, but I'm not certain it is generally applicable.
I'm trying to figure out the best heuristics for weighing both code size
and speed.
This is quite surprising, as I can measure fairly clear benefits on the
SPEC benchmarks running on Opteron, for instance. You need to use
-mfpmath=sse -msse2 (or -march=sse2_enabled_CPU) to get double-precision
arithmetic in SSE. I also reproduced similar results for Pentium 4 in
the past. The only CPU that does not seem to prefer SSE operations is
the Pentium M in my notebook (the hardware implementation is probably
quite poor).
Being stuck with a Pentium 4 (Northwood), I haven't been able to see how
the SSE options affect code on the Opteron. -msse is implied by
-march=pentium4; I've checked this, both in the GCC source code and by
comparing compiles with and without -msse.
How does this compare to -fomit-frame-pointer?
As others have pointed out, -fomit-frame-pointer is not enabled for the
P4, so no comparison can be made with -momit-leaf-frame-pointer.
The drawback is again the code size, so perhaps it can be -O3 only.
Or, perhaps, we could consider a -O4, meaning "I don't give a darn about
code size, just optimize like heck."
--
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Software Invention for High-Performance Computing
In development: Alex, a database for common folk