This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


-O2 "pessimizations" on pentium2, egcs-19980921

To: egcs at cygnus dot com
Subject: -O2 "pessimizations" on pentium2, egcs-19980921
From: N8TM at aol dot com
Date: Sat, 26 Sep 1998 15:23:47 EDT

egcs-19980921 has reduced the number of cases where -O2 and -Os perform
differently on pentium2.  I made these comparisons using
'g77 -O[2s] -funroll-loops -malign-double -march=pentiumpro'.

The remaining cases where -O2 is slower all share the following sort of
source-level optimization, in which the code tries to force the compiler to
re-use a local copy of an array element (example taken from a modified
Livermore Kernel 2):

      i= ipntp+1
      rtmp= x(ipnt+1)
      do k= ipnt+2,ipntp,2
          i= i+1
          rtmp1= x(k+1)
          x(i)= x(k)-v(k)*rtmp-v(k+1)*rtmp1
          rtmp= rtmp1
      enddo

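Here there is a false aliasing problem: the store to x(i) looks as if it
could overwrite elements that later iterations load through x(k) and x(k+1).
The problem is worse if the code is not written this way, since few compilers
analyze the loop to see that the ranges of i and k do not overlap.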
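
To make the non-overlap concrete, here is a minimal sketch that runs the
hand-optimized loop next to the straightforward form of the same recurrence
and counts differing elements.  The program name, the array size, the values
ipnt=0 and ipntp=8, and the test data are my own assumptions for illustration;
they are not taken from the benchmark.

```fortran
      program lfk2
C     Hypothetical driver: compare the hand-optimized loop with the
C     straightforward form of the same recurrence.
      integer n
      parameter (n = 16)
      real x1(n), x2(n), v(n), rtmp, rtmp1
      integer i, k, j, ipnt, ipntp, nbad
C     Assumed test values; small integers keep the arithmetic exact.
      ipnt = 0
      ipntp = 8
      do j = 1, n
         x1(j) = real(j)
         x2(j) = real(j)
         v(j) = real(mod(j, 3))
      enddo
C     Hand-optimized form: x(k+1) is loaded once and carried in rtmp.
      i = ipntp+1
      rtmp = x1(ipnt+1)
      do k = ipnt+2, ipntp, 2
         i = i+1
         rtmp1 = x1(k+1)
         x1(i) = x1(k)-v(k)*rtmp-v(k+1)*rtmp1
         rtmp = rtmp1
      enddo
C     Straightforward form: x(k-1) is reloaded on every iteration.
      i = ipntp+1
      do k = ipnt+2, ipntp, 2
         i = i+1
         x2(i) = x2(k)-v(k)*x2(k-1)-v(k+1)*x2(k+1)
      enddo
C     Stores go to x(ipntp+2) and above, loads stop at x(ipntp+1),
C     so the two forms should agree element for element.
      nbad = 0
      do j = 1, n
         if (x1(j) .ne. x2(j)) nbad = nbad+1
      enddo
      print *, 'mismatches:', nbad
      end
```

With these bounds the loads touch x(1) through x(9) and the stores touch
x(10) through x(13), so the program should report zero mismatches.  A compiler
that cannot prove this has to assume that each store may change a value the
next iteration loads.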

A situation in which the current snapshot is much slower than egcs-1.1 occurs
in Livermore Kernel 17:

      scale= dw
      rtmp= fw
      e6= tw
      do k= n,2,-1
          e3= rtmp*vlr(k)+vlin(k)
          xnei= vxne(k)
          vxnd(k)= e6
          xnc= scale*e3
C                                        SELECT MODEL
          if(rtmp <= xnc.and.xnei <= xnc)then
C                                        LINEAR MODEL
              ve3(k)= e3
              rtmp= e3+e3-rtmp
              vxne(k)= e3+e3-xnei
              e6= rtmp
          else
              rtmp= rtmp*vsp(k)+vstp(k)
C                                        STEP MODEL
              vxne(k)= rtmp
              ve3(k)= rtmp
              e6= rtmp
          endif
      enddo
      xnm= rtmp
