This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: (a+b)+c should be replaced by a+(b+c)
On Thu, 25 Mar 2004, Robert Dewar wrote:
> Joost VandeVondele wrote:
>
> > good compilers (e.g. xlf90) will (at -O4) do higher order transforms of
> > the loop to introduce blocking, independent FMAs, ... that makes this
> > little piece of code about 100 times faster at O4 than O2 (what about
..
>
> Can you really deduce this freedom from later versions of the Fortran
> standard?
>
No. I'm simply happy that there are compilers that make my code 100 times
faster without much work on my part, while keeping the code easy to
maintain and read.
Another example that comes to mind and relies on this kind of optimization
is OMP/MPI code. There is a large class of problems for which this
optimization is exactly what is needed.
BTW, here are timings of the code below on an IBM SP4 with xlf90. It would
be useful to see how gfortran performs.
O2: 116.76 s
O4:   2.4 s
O5:   1.6 s
Joost
INTEGER, PARAMETER :: N=1024
REAL*8 :: A(N,N), B(N,N), C(N,N)
REAL*8 :: t1,t2
A=0.1D0
B=0.1D0
C=0.0D0
CALL cpu_time(t1)
CALL mult(A,B,C,N)
CALL cpu_time(t2)
write(6,*) t2-t1,C(1,1)
END

SUBROUTINE mult(A,B,C,N)
  REAL*8 :: A(N,N), B(N,N), C(N,N)
  INTEGER :: I,J,K,N
  DO J=1,N
    DO I=1,N
      DO K=1,N
        C(I,J)=C(I,J)+A(I,K)*B(K,J)
      ENDDO
    ENDDO
  ENDDO
END
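
For illustration, here is a hand-written sketch of one transform of the kind mentioned above (independent multiply-adds). It is not what xlf90 actually generates, and the names mult_reassoc, S1 and S2 are made up for this example. The K loop is split over two independent accumulators, so the additions happen in a different order than in mult. That is precisely the reassociation under discussion, and the result may differ from mult in the last bits.

```fortran
! Sketch only: mult_reassoc, S1 and S2 are hypothetical names.
! Same product as mult, but the sum over K is split into two
! independent partial sums, which changes the order of the additions.
SUBROUTINE mult_reassoc(A,B,C,N)
  INTEGER :: N
  REAL*8 :: A(N,N), B(N,N), C(N,N)
  REAL*8 :: S1,S2
  INTEGER :: I,J,K
  DO J=1,N
    DO I=1,N
      S1=0.0D0
      S2=0.0D0
      ! Two accumulators with no dependence on each other
      DO K=1,N-1,2
        S1=S1+A(I,K)*B(K,J)
        S2=S2+A(I,K+1)*B(K+1,J)
      ENDDO
      ! Leftover term when N is odd
      IF (MOD(N,2)==1) S1=S1+A(I,N)*B(N,J)
      ! Combine the partial sums: (S1+S2) instead of a strict left-to-right sum
      C(I,J)=C(I,J)+(S1+S2)
    ENDDO
  ENDDO
END
```

Calling mult_reassoc(A,B,C,N) in place of mult(A,B,C,N) in the driver above is enough to compare the two. A compiler may only make this change on its own if it is allowed to treat (a+b)+c and a+(b+c) as interchangeable.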