This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
(a+b)+c should be replaced by a+(b+c)
- From: Joost VandeVondele <jv244 at hermes dot cam dot ac dot uk>
- To: gcc at gcc dot gnu dot org
- Date: Thu, 25 Mar 2004 07:24:27 +0000 (GMT)
- Subject: (a+b)+c should be replaced by a+(b+c)
I think there is an obvious need for the optimization
(a+b)+c -> a+(b+c), e.g. in many scientific codes.
Consider matrix multiply:
do k=1,N
  do j=1,N
    do i=1,N
      c(i,j)=c(i,j)+a(i,k)*b(k,j)
    enddo
  enddo
enddo
Good compilers (e.g. xlf90) will, at -O4, apply higher-order transforms to
the loop nest to introduce blocking, independent FMAs, etc. These make this
little piece of code about 100 times faster at -O4 than at -O2 (what about
LNO/SSA?). This can only be done if you allow (a+b)+c -> a+(b+c), because
the transforms change the order in which the products are summed. It is
basically what any optimized BLAS routine will do. Matrix multiply is a
trivial example; if you want BLAS performance, call BLAS. But there are many
other kernels like this in e.g. scientific code that are not BLAS. You
can't expect a scientist to hand-unroll and block every kernel to the
appropriate depth for every machine. There needs to be a compiler option to
do this. This can only be done if you allow (a+b)+c -> a+(b+c).
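
To make the reassociation concrete, here is a minimal sketch of what a
hand-unrolled version of the k loop might look like. It is only an
illustration, not what xlf90 or any other compiler actually generates. The
program name, the array names c_ref and c_new, the size n=64 (assumed even)
and the tolerance are all made up for this example. The first loop nest
sums the products as (c+p1)+p2, the second as c+(p1+p2), so in floating
point the two results may differ in the last bits.

```fortran
program reassoc_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 64   ! hypothetical size, assumed even
  real(dp) :: a(n,n), b(n,n), c_ref(n,n), c_new(n,n)
  real(dp) :: diff
  integer :: i, j, k

  call random_number(a)
  call random_number(b)
  c_ref = 0.0_dp
  c_new = 0.0_dp

  ! Original order: c(i,j) is updated one product at a time,
  ! so two consecutive k steps compute (c + p1) + p2.
  do k = 1, n
    do j = 1, n
      do i = 1, n
        c_ref(i,j) = c_ref(i,j) + a(i,k)*b(k,j)
      end do
    end do
  end do

  ! k loop unrolled by 2 and reassociated: the two products are
  ! added first, giving c + (p1 + p2). The parentheses force
  ! this evaluation order.
  do k = 1, n, 2
    do j = 1, n
      do i = 1, n
        c_new(i,j) = c_new(i,j) + (a(i,k)*b(k,j) + a(i,k+1)*b(k+1,j))
      end do
    end do
  end do

  ! The results agree only up to rounding; the tolerance is an
  ! assumed, generous bound for this problem size.
  diff = maxval(abs(c_ref - c_new))
  print *, 'largest absolute difference:', diff
  if (diff > 1.0e-10_dp) error stop 'results differ by more than rounding'
end program reassoc_sketch
```

Because the two versions are not guaranteed to be bitwise identical, a
compiler that preserves the written evaluation order cannot make this
rewrite on its own, which is why an explicit option is needed.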
Joost