This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
-O2 "pessimizations" on pentium2, egcs-19980921
- To: egcs at cygnus dot com
- Subject: -O2 "pessimizations" on pentium2, egcs-19980921
- From: N8TM at aol dot com
- Date: Sat, 26 Sep 1998 15:23:47 EDT
egcs-19980921 has cut down on the number of cases where -O2 and -Os perform
differently on pentium2. I made these comparisons using
'g77 -O[2s] -funroll-loops -malign-double -march=pentiumpro'.
My remaining cases where -O2 is slower share the following sort of source
code optimization, where an attempt is made to force the compiler to re-use a
local copy of an array element (example taken from modified Livermore Kernel
2):
      i= ipntp+1
      rtmp= x(ipnt+1)
      do k= ipnt+2,ipntp,2
         i= i+1
         rtmp1= x(k+1)
         x(i)= x(k)-v(k)*rtmp-v(k+1)*rtmp1
         rtmp= rtmp1
      enddo
Here there is a false aliasing problem: the store to x(i) looks as if it could
overlap the loads from x(k) and x(k+1). The problem is worse if the code is not
written this way, as few compilers analyze this code to see that the values of
i and k do not overlap.
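
For comparison, here is a minimal sketch of the loop above without the local
copies, run side by side with the local-copy form to check that the two agree.
The straightforward form is inferred from the posted code (rtmp holds x(k-1)
on each trip). The program name, array size, index values and test data are
all hypothetical, chosen only so that the stores to x(i) land past the
elements being read.

```fortran
      program k2demo
c     Hypothetical harness: n, ipnt, ipntp and the data are made up.
      integer n
      parameter (n= 16)
      real x1(2*n), x2(2*n), v(2*n)
      real rtmp, rtmp1
      integer i, k, ipnt, ipntp
      do k= 1,2*n
         v(k)= 0.01*k
         x1(k)= 1.0+0.5*k
         x2(k)= x1(k)
      enddo
      ipnt= 0
      ipntp= n
c     Straightforward form: x(k-1) and x(k+1) are read from the array
c     on every trip, next to a store to x(i) in the same array.
      i= ipntp+1
      do k= ipnt+2,ipntp,2
         i= i+1
         x1(i)= x1(k)-v(k)*x1(k-1)-v(k+1)*x1(k+1)
      enddo
c     Local-copy form from the post: x(k+1) is kept in rtmp1 and
c     reused as x(k-1) on the next trip.
      i= ipntp+1
      rtmp= x2(ipnt+1)
      do k= ipnt+2,ipntp,2
         i= i+1
         rtmp1= x2(k+1)
         x2(i)= x2(k)-v(k)*rtmp-v(k+1)*rtmp1
         rtmp= rtmp1
      enddo
c     The two forms should agree, since i stays above every k read.
      do k= 1,2*n
         if(abs(x1(k)-x2(k)) .gt. 1.0e-5*abs(x1(k))) stop 1
      enddo
      print *, 'both forms agree'
      end
```

With these values the loop reads x(1) through x(17) and writes x(18) through
x(25), so the overlap a compiler has to assume never actually happens.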
A situation in which the current snapshot is much slower than egcs-1.1 occurs
in Livermore Kernel 17:
      scale= dw
      rtmp= fw
      e6= tw
      do k= n,2,-1
         e3= rtmp*vlr(k)+vlin(k)
         xnei= vxne(k)
         vxnd(k)= e6
         xnc= scale*e3
C SELECT MODEL
         if(rtmp <= xnc.and.xnei <= xnc)then
C LINEAR MODEL
            ve3(k)= e3
            rtmp= e3+e3-rtmp
            vxne(k)= e3+e3-xnei
            e6= rtmp
         else
            rtmp= rtmp*vsp(k)+vstp(k)
C STEP MODEL
            vxne(k)= rtmp
            ve3(k)= rtmp
            e6= rtmp
         endif
      enddo
      xnm= rtmp