This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: (a+b)+c should be replaced by a+(b+c)


On Thu, 25 Mar 2004, Robert Dewar wrote:

> Joost VandeVondele wrote:
>
> > good compilers (e.g. xlf90) will (at -O4) do higher order transforms of
> > the loop to introduce blocking, independent FMAs, ... that makes this
> > little piece of code about 100 times faster at O4 than O2 (what about
..
>
> Can you really deduce this freedom from later versions of the Fortran
> standard?
>
No, I'm just happy that there are compilers that make my code 100 times faster
without much work on my part, so that my code stays easy to maintain and
read.

Another example that comes to mind of code that relies on this kind of
optimization is OMP/MPI code. There is a large class of problems for which
this optimization is exactly what is needed.
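
The link to OMP/MPI code is presumably that a parallel reduction adds per-thread partial sums and then combines them, which groups the additions differently from the serial loop. A minimal sketch of that effect, assuming IEEE double precision with round-to-nearest and no value-changing flags; the name two_orders is hypothetical and the two-way split only stands in for two threads:

```fortran
! Hypothetical helper, not from the original message.
! Sums X(1:N) in serial order, and again as two half sums that are then
! combined, the way a two-thread reduction might group the additions.
SUBROUTINE two_orders(X,N,serial,paired)
  IMPLICIT NONE
  INTEGER :: N,I
  REAL*8 :: X(N), serial, paired, part1, part2
  ! Serial order: ((X(1)+X(2))+X(3))+...
  serial=0.0D0
  DO I=1,N
    serial=serial+X(I)
  ENDDO
  ! First half, as one "thread" would accumulate it
  part1=0.0D0
  DO I=1,N/2
    part1=part1+X(I)
  ENDDO
  ! Second half, as the other "thread" would accumulate it
  part2=0.0D0
  DO I=N/2+1,N
    part2=part2+X(I)
  ENDDO
  ! Combining the partial sums regroups the additions
  paired=part1+part2
END
```

Called with N=4 and X = (/ 1.0D16, 1.0D0, -1.0D16, 1.0D0 /), the serial sum should come out as 1 and the paired sum as 0 under the stated assumptions, because 1.0D16+1.0D0 rounds back to 1.0D16.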

BTW, here are timings of the code below on an IBM SP4 with xlf90. It would be
useful to see how gfortran performs.

O2: 116.76 s
O4:   2.4 s
O5:   1.6 s

Joost

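! Driver: time a 1024x1024 double-precision matrix multiply
INTEGER, PARAMETER :: N=1024
REAL*8 :: A(N,N), B(N,N), C(N,N)
REAL*8 :: t1,t2
A=0.1D0
B=0.1D0
C=0.0D0
CALL cpu_time(t1)
CALL mult(A,B,C,N)
CALL cpu_time(t2)
! Print elapsed CPU time in seconds and one element of the result
write(6,*) t2-t1,C(1,1)
END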

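! Naive triple loop computing C = C + A*B
SUBROUTINE mult(A,B,C,N)
REAL*8 :: A(N,N), B(N,N), C(N,N)
INTEGER :: I,J,K,N
DO J=1,N
  DO I=1,N
    ! Inner loop over K accumulates the dot product into C(I,J)
    DO K=1,N
      C(I,J)=C(I,J)+A(I,K)*B(K,J)
    ENDDO
  ENDDO
ENDDO
END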
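
As a rough illustration of the "blocking" transform mentioned in the quoted text, here is what a hand-blocked version of mult might look like. This is only a sketch: the name mult_blocked and the block size NB=32 are assumptions, and it is not what xlf90 actually generates at -O4.

```fortran
! Hypothetical hand-blocked variant of mult, not from the original message.
! NB=32 is an arbitrary block size chosen for illustration.
SUBROUTINE mult_blocked(A,B,C,N)
  IMPLICIT NONE
  INTEGER, PARAMETER :: NB=32
  INTEGER :: N,I,J,K,JJ,KK
  REAL*8 :: A(N,N), B(N,N), C(N,N)
  DO JJ=1,N,NB
    DO KK=1,N,NB
      ! Within one J/K tile, the N x NB panel A(:,KK:KK+NB-1) is reused
      ! for every column J of the tile
      DO J=JJ,MIN(JJ+NB-1,N)
        DO K=KK,MIN(KK+NB-1,N)
          ! Innermost loop runs down a column: stride-1 access to C and A
          DO I=1,N
            C(I,J)=C(I,J)+A(I,K)*B(K,J)
          ENDDO
        ENDDO
      ENDDO
    ENDDO
  ENDDO
END
```

To try it, replace CALL mult(A,B,C,N) in the driver with CALL mult_blocked(A,B,C,N). Note that this particular sketch still adds the K terms of each C(I,J) in ascending order, so it does not regroup the sum. Splitting the K loop into several independent partial sums, as independent FMAs would require, is the step that does change the grouping.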

